Data bug with Google Search Console's Search Analytics report

Google has posted about a data anomalies bug affecting the Search Analytics report in Google Search Console. The specific issue shows up when you use the “AMP non-rich results” search appearance filter and look at clicks and impressions between December 14, 2017, and December 18, 2017.

Google said there “was an error in counting AMP non-rich results impressions and clicks” between those dates and you “might see a drop in your data during this period.” It did not impact the actual search results; it was just an analytics bug.

Here is what the report might look like for you:

The data should return to normal on or after December 19, 2017, but those few days will have some inaccurate data.

Google offers advice on how to get ready for the mobile-first index


Google has posted on the webmaster blog more advice around getting ready for the mobile-first index.

Google confirmed it has rolled out the mobile-first index “for a handful of sites” and said the search team is “closely” monitoring those sites for testing purposes.

You will know when your site has moved over by checking your log files for a significantly increased crawl rate from the Smartphone Googlebot; in addition, the snippets in the search results, as well as the content on the Google cache pages, will come from the mobile version of your web pages. Again, Google said only a small number of sites have migrated.

Gary Illyes from Google posted several tips to get ready for the mobile-first index:

- Make sure the mobile version of the site also has the important, high-quality content. This includes text, images (with alt-attributes), and videos — in the usual crawlable and indexable formats.
- Structured data is important for indexing and search features that users love: It should be on both the mobile and desktop versions of the site. Ensure URLs within the structured data are updated to the mobile version on the mobile pages.
- Metadata should be present on both versions of the site. It provides hints about the content on a page for indexing and serving. For example, make sure that titles and meta descriptions are equivalent across both versions of all pages on the site. (A rough parity check is sketched after this list.)
- No changes are necessary for interlinking with separate mobile URLs (m.-dot sites). For sites using separate mobile URLs, keep the existing link rel=canonical and link rel=alternate elements between these versions.
- Check hreflang links on separate mobile URLs. When using link rel=hreflang elements for internationalization, link between mobile and desktop URLs separately. Your mobile URLs’ hreflang should point to the other language/region versions on other mobile URLs, and similarly link desktop with other desktop URLs using hreflang link elements there.
- Ensure the servers hosting the site have enough capacity to handle a potentially increased crawl rate. This doesn’t affect sites that use responsive web design and dynamic serving, only sites where the mobile version is on a separate host, such as m.example.com.
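To make the metadata tip concrete, here is a minimal parity-check sketch for titles and meta descriptions across a desktop page and its separate mobile URL. It is illustrative only: the URLs are placeholders, and the requests/BeautifulSoup approach is my own assumption rather than anything Google prescribes.

```python
# A minimal sketch of the metadata parity check suggested above, for sites
# with separate mobile URLs (m-dot). The URLs are placeholders.
import requests
from bs4 import BeautifulSoup

def title_and_description(url):
    """Return the <title> text and meta description of a page."""
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    title = soup.title.get_text(strip=True) if soup.title else ""
    meta = soup.find("meta", attrs={"name": "description"})
    description = meta.get("content", "").strip() if meta else ""
    return title, description

desktop = title_and_description("https://www.example.com/widgets")
mobile = title_and_description("https://m.example.com/widgets")

print("Titles match:           ", desktop[0] == mobile[0])
print("Meta descriptions match:", desktop[1] == mobile[1])
```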

For more information, check out our mobile-first index FAQs.

Optimizing for Hanukkah: Sometimes it’s still strings, not things

My wife came to me with a problem. She wanted festive, whimsical, and potentially matching Hanukkah pajamas. But there weren’t enough options coming up in Google under one spelling of the holiday’s name, so she told me she was systematically going through all spellings to compile her list of shopping items.

I was pretty surprised by this — I had expected Google to be smart enough to recognize that these were alternative spellings of the same thing, especially post-Hummingbird. Clearly, this was not the case.

Some background for those who don’t know: Hanukkah is actually a transliterated word from Hebrew. Since Hebrew has its own alphabet, there are numerous spellings that one can use to reference it: Hanukkah, Chanukah, and Channukah are all acceptable spellings of the same holiday.

So, when someone searches for “Hanukkah pajamas” or “Chanukah pajamas,” Google really should be smart enough to understand that they are different spellings of the same concept and provide nearly identical results. But Google does not! I imagine this happens for other holidays and names from other cultures, and I’d be curious to know if other readers experience the same problem with those.

Why am I surprised that Google is returning different results for different spellings? Well, with the introduction of the Knowledge Graph (and Hummingbird), Google signaled a change for SEO. More than ever before, we could start thinking about search queries not merely as keyword strings, but as interrelated real-world concepts.

What do I mean by this?

When someone searches for “Abraham Lincoln,” they’re more than likely searching for the entity representing the 16th president of the United States, rather than the appearance of the words “Abraham” and “Lincoln,” or their uncle, also named Abraham Lincoln. And if they search for “Lincoln party,” Google knows we’re likely discussing political parties, rather than parties in the town of Lincoln, Mass., because this is a concept in close association with the historical entity Abraham Lincoln.

Similarly, Google is certainly capable of understanding that when we use the keyword Hanukkah, it is in reference to the holiday entity and that the various spellings are also referring to the same entity. Despite different spellings, the different searches actually mean the same thing. But alas, as demonstrated by my wife’s need to run a different search for each spelling of the holiday in order to discover all of her Hanukkah pajama options, Google wasn’t doing the best job.

So, how widespread is the Chanukah/Hanukkah/Chanukkah search problem? Here are a couple of search results for Chanukah items:

As you can see from the first screen shot, some big box retailers like Target, Macy’s and JCPenney rank on page one of Google. In screen shot two, however, they are largely absent — and sites like PajamaGram and Etsy are dominating the different spelling’s SERP.

This means that stores targeting the already small demographic of Hanukkah shoppers are actually reducing the number of potential customers by only using one spelling on their page. (Indeed, according to my keyword tool of choice, although “Hanukkah” has the highest search volume of all variants at 301,100 global monthly searches, all other spellings combined still make up a sizeable 55,500 searches — meaning that retailers optimizing for both terms could be seeing 18 percent more traffic.)

Investigating spelling variations and observations

Since I’m an ever-curious person, I wanted to investigate this phenomenon a little further.

I built a small, simple tool to show how similar the search engine results pages (SERPs) for two different queries are by examining which listings appear in both SERPs. If we look at five common spellings of Hanukkah, we see the following:

Keyword 1 | Keyword 2 | SERP Similarity
Channukah | Chanukah | 90.00%
Channukah | Hannukah | 20.00%
Channukah | Hannukkah | 20.00%
Channukah | Hanukkah | 30.00%
Chanukah | Hannukah | 20.00%
Chanukah | Hannukkah | 20.00%
Chanukah | Hanukkah | 30.00%
Hannukah | Hannukkah | 90.00%
Hannukah | Hanukkah | 80.00%
Hannukkah | Hanukkah | 80.00%

The tool shows something quite interesting here: Not only are the results different, but depending on spelling, the results may only be 20 percent identical, meaning eight out of 10 of the listings on page one are completely different.
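The tool itself isn’t shown here, but the comparison it performs is simple set overlap. Below is a minimal sketch of how such a SERP-similarity figure could be computed, assuming you have already collected the page-one listing URLs for each query (the example URLs are made up):

```python
# Sketch of a page-one SERP overlap comparison. Assumes you already have the
# listing URLs for each query (e.g., from a rank tracker or manual export).
def serp_similarity(serp_a, serp_b):
    """Percentage of page-one listings shared by both SERPs."""
    shared = set(serp_a) & set(serp_b)
    return len(shared) / max(len(serp_a), len(serp_b)) * 100

# Hypothetical example data
hanukkah_serp = [
    "https://retailer-one.example/hanukkah-pajamas",
    "https://blog.example/holiday-gift-guide",
]
chanukah_serp = [
    "https://retailer-one.example/hanukkah-pajamas",
    "https://retailer-two.example/chanukah-pajamas",
]

print(f"SERP similarity: {serp_similarity(hanukkah_serp, chanukah_serp):.2f}%")
```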

I then became curious about why the terms weren’t canonicalized to each other, so I looked at Wikidata, one of the primary data sources that Google uses for its Knowledge Graph. As it turns out, there is an entity with all of the variants accounted for:

I then checked the Google Knowledge Graph Search API, and it became very clear that Google may be confused:

Keyword | resultScore | @id | name | Description | @type
Channukah | 8.081924 | kg:/m/0vpq52 | Channukah Love | Song by Ju-Tang | [MusicRecording, Thing]
Chanukah | 16.334606 | kg:/m/06xmqp_ | A Rugrats Chanukah | ? | [Thing]
Hannukah | 11.404715 | kg:/m/0zvjvwt | Hannukah | Song by Lorna | [MusicRecording, Thing]
Hannukkah | 11.599854 | kg:/m/06vrjy9 | Hannukkah | Book by Jennifer Blizin Gillis | [Book, Thing]
Hanukkah | 21.56493 | kg:/m/02873z | Hanukkah Harry | Fictional character | [Thing]

The resultScore values — which, according to the API documentation, indicate “how well the entity matched the request constraints” — are very low. In this case, the entity wasn’t very well matched. This would be consistent with the varying results, if it weren’t for the fact that a Knowledge Graph panel is returned for all of the spelling variants with the Freebase ID /m/022w4 — different from what the Knowledge Graph Search API returns. So, in this case, it seems that the API may not be a reliable means of assessing the problem. Let’s move on to some other observations.
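For readers who want to run the same lookup, the Knowledge Graph Search API accepts a simple HTTP request with an API key. The sketch below is a minimal example of that call; the API key is a placeholder, and the response handling is deliberately simplified.

```python
# Minimal Knowledge Graph Search API lookup for each spelling variant.
# Replace YOUR_API_KEY with a real key from the Google Cloud console.
import requests

API_KEY = "YOUR_API_KEY"  # placeholder

def top_entity(keyword):
    resp = requests.get(
        "https://kgsearch.googleapis.com/v1/entities:search",
        params={"query": keyword, "key": API_KEY, "limit": 1},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json().get("itemListElement", [])
    if not items:
        return None
    top = items[0]
    return {
        "resultScore": top.get("resultScore"),
        "@id": top["result"].get("@id"),
        "name": top["result"].get("name"),
        "description": top["result"].get("description"),
    }

for spelling in ["Channukah", "Chanukah", "Hannukah", "Hannukkah", "Hanukkah"]:
    print(spelling, top_entity(spelling))
```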

It is interesting to note that when searching for Channukah, Google pushed users to Chanukah results. When searching for Hannukah and Hannukkah, Google pushed users to Hanukkah results. So, Google does seem to group Hanukkah spellings together based on whether they start with an “H” or a “Ch.”

Chanukah, Hannukah, and Hanukkah were also the only variations that received the special treatment of the Hanukkah menorah graphic:

What a retailer selling Hanukkah products should do

Clearly, if we want full coverage of terms (and my wife to find your Hanukkah pajamas), we cannot rely on just optimizing for the highest search volume variation of the keyword, as Google doesn’t seem to view all variants as entirely the same. Your best bet is to include the actual string for each spelling variant somewhere on the page, rather than relying on Google to understand them as variations of the same thing.

If you’re a smaller player, it may make sense to prioritize optimizations toward one of the less popular spelling variants, as the organic competition may not be as significant. (Of course, this does not bar you from using spelling variants in addition to that for the potential of winning for multiple spellings.)

At a bare minimum, you may opt to include one spelling beginning with H- and one beginning with Ch-, and hope that Google will direct users to the same SERP in most cases.

Future experiment

I started an experiment to see whether the inclusion of structured data with sameAs properties may be a potential avenue for getting Google to understand a single spelling as an entity, eliminating the need to include different spelling variations. As of now, the test is too young to draw conclusions from, but I look forward to sharing the results in the future.
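For context, here is the general shape such markup could take — a hypothetical product page whose structured data uses sameAs to point a single spelling at a well-known entity page. This is an illustrative sketch, not the markup actually used in the experiment; the product name and URLs are placeholders.

```python
# Illustrative JSON-LD with a sameAs reference for the Hanukkah entity.
# This is not the experiment's actual markup; names and URLs are placeholders.
import json

product_page_jsonld = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Festive Hanukkah Pajamas",  # hypothetical product
    "about": {
        "@type": "Thing",
        "name": "Hanukkah",
        "sameAs": [
            "https://en.wikipedia.org/wiki/Hanukkah",
            # Other entity URLs (e.g., a Wikidata item) could be listed here too.
        ],
    },
}

# The serialized JSON-LD would be embedded in the page inside a
# <script type="application/ld+json"> element.
print(json.dumps(product_page_jsonld, indent=2))
```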

Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.

How to generate links that drive traffic, not just ranking

Many people see link building as a way to drive rankings. But, when done correctly, it can (and should) also drive traffic.

Driving traffic has a lot of benefits beyond the obvious potential increase in leads and sales. More website traffic can provide valuable analytics data about what users are looking for and what confuses them. It can also help grow engagement and potentially earn referral links as others begin to share your content on social media.

In this column, I’ll explain how to identify sources of links that drive actual traffic and how to evaluate your progress so that you can focus your efforts where they will have the greatest impact.

Identifying link partners

In order to find good sources for traffic-driving links, there are a few ways you can go: competitor research, rankings and influencers.

First, find the publications driving traffic to your competitors by using tools like SimilarWeb to find their top referral sources. Not only do these tools tell you who is linking to your competitors, but some can also show how much traffic your competitors are getting from those links.

Any site driving traffic/referrals to your competitors should be investigated and evaluated as a potential linking partner. Check each one for quality, verifying that they aren’t content scraper sites and are actually valuable resources for your target audience. If they pass the test, then consider approaching them for a link.

Of course, you shouldn’t just pursue links from sites that are driving traffic to your competitors. Review the top-ranking websites in Google for the terms you want to rank for and see if any of them can serve as good linking partners. For example, many industries have vertical-specific directories that provide both free and sponsored listings.

As always, do your research when approaching sites like this. Do the directories seem spammy, designed only to generate links for SEO purposes? Or are they legitimate sites that consumers actually use, like Yelp, TripAdvisor or Avvo? (Note that links from legitimate sites will often be nofollowed, but they are still valuable because they drive real traffic.)

If you want to do more of the heavy lifting when it comes to content, try approaching major and niche industry outlets that you can contribute content to. In addition to the above sites you found during your research, use a tool like BuzzSumo to find social influencers and reach out to them on their social channels or via email to see if they accept guest posts. These posts need to be highly relevant to the website’s audience, and be careful to follow any editorial guidelines and respect their rules for submitted content.

In addition to smaller industry publications, you can also find guest posting opportunities on major sites like Inc.com through their guest posting forms. The byline link or the author page can be a great source of traffic and referrals. Often, I’ve gotten leads from these links just because the prospect was impressed with seeing the byline in major outlets. However, you must be diligent and careful here: Submit your best work, as inclusion is often competitive, and editors can therefore be extremely choosy.

Other great outlet options to consider are community forums, like industry-specific subreddits or sites like Inbound.org if you are in marketing. Just remember to be a good community member — never spam other users with your own content, and be sure to participate regularly by answering questions and commenting thoughtfully on others’ content.

One last angle to try is to find industry influencers and sponsor or partner with them. Many influencers are willing to enter into partnerships with brands, where they will review or work with a company on content and social media posts to get the brand’s name out to their audience. Cost usually varies with audience size and the scope of the campaign.

Since the aim here is to drive traffic and branding, you shouldn’t run into any issues regarding Google’s linking guidelines. However, it’s important to ensure that all financial relationships are disclosed according to FTC guidelines and that you aren’t attempting to hide or sneak links into any content that you are sending to these outlets for publication.

Evaluating success

Once you’ve approached your chosen link partners and successfully obtained links, it’s time to review your work. Each month, check Google Analytics for referral traffic to see which of the new sites you’ve worked with are actually bringing you traffic. After three to six months, you’ll have a clear picture of which sites are worth your time and which aren’t. For instance, if Inc.com is bringing you more traffic than three industry sites combined, it might be better to pare down your industry sites to be able to submit more content to Inc.com.

You can also see whether there is an increase in overall brand search for your name using Google Trends or Google Keyword Planner. Often, branding campaigns can result in more direct traffic, as well as organic traffic due to an increase in branded searches. By carefully tracking increases in direct and branded organic referrals, you can see the impact your branding campaigns are having. This can help you see the long-term benefits of your link-building efforts in growing your website traffic.

While tracking the data, be sure to also track your success building relationships with the influencers and websites you’ve singled out as potential link-building partners. This can show your progress to management and help you hone your pitch and messaging style.

Final thoughts

Link building, no matter the goal, is hard work if you want it to be done ethically and with enduring value. Building a healthy link portfolio can help you generate traffic from a wide variety of referral sources, while also increasing your overall online presence and making sure you own more of your branded search terms. Be sure to cast a wide net by working with many different sites and platforms.

Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.

Bing announces new AI-powered 'intelligent search' features

At Microsoft’s AI event yesterday in San Francisco, the company showcased its vision for AI-enabled computing, as well as its AI differentiation strategy. The latter essentially boils down to three big ideas: making AI-supported software broadly accessible to people to improve “everyday” experiences, seamlessly combining work and personal functionality in the same tools, and striving to be an ethical AI company.

Microsoft showed how AI and machine learning are now supporting its marquee products, from Windows to Office 365 and Bing. The most impressive demonstration of the day (from a self-interested perspective) featured AI-guided and automated design suggestions in PowerPoint.

There were several Bing-centric AI announcements, all under the heading of “intelligent search”:

- Intelligent Answers
- Intelligent Image Search
- Conversational Search

Intelligent Answers

Think of this as a kind of “next-gen Featured Snippets.” But what is different and interesting is that Bing is often summarizing or comparing multiple sources of information rather than just presenting a single answer.

If there are competing perspectives on an issue, for example, Bing will present them. It will also provide a “carousel of intelligent answers” if there are multiple answers to a question. This is intended to replace “blue links” and provide quick access to relevant information.

Below is a Bing-provided example of a comparison involving two different content sources on the question, “Is kale good for you?”

Intelligent Image Search

Here Bing is essentially doing what Pinterest announced in 2016 with “visual search” and object recognition. Bing is seeking to make virtually any image “shoppable.” Right now, that capability is focused on fashion and home furniture.

Users can “click the magnifying glass icon on the top right of any image to search within an image and find related images or products.” The example below illustrates how it works.

Bing can also detect and identify buildings and landmarks in user photos or in image search — though not yet in the real world.

Google Lens offers visual search for objects and places in the real world (so does Amazon, for products). I would anticipate that soon Bing will introduce a similar Lens-like capability through Cortana or its search app.

Conversational Search

Bing is taking search suggest/autocomplete to a new level with what it’s calling “conversational search.” From a very general or vague query, Bing will help with query refinement suggestions:

Now if you need help figuring out the right question to ask, Bing will help you with clarifying questions based on your query to better refine your search and get you the best answer the first time around. You’ll start to see this experience in health, tech and sports queries, and we will be adding more topic areas over time. And because we’ve built it with large-scale machine learning, the experience will get better over time as more users engage with it.

Finally, the company also announced the integration of Reddit content (answers/opinions) into Bing. Tim Peterson wrote about that in more detail yesterday. In short, however, Bing is going to show snippets of Reddit content or conversations when it believes that’s the best source of information.

Microsoft will also promote AMAs in search results and in knowledge panels: “On Bing you can discover AMA schedules and see snapshots of AMAs that have already been completed. Simply search a person’s name to see their AMA snapshot or search for ‘Reddit AMAs’ to see a carousel of popular AMAs.”

It’s unlikely that any of these changes will move the needle on market share in the short term. However, collectively they show an AI-driven acceleration of changes in search overall. Google will probably be compelled to answer a couple of the new Microsoft features.

If Microsoft truly wants to convert more users, it will need to be even bolder with features, content and UI changes. And the company is in a fairly strong position to be disruptive because it doesn’t rely on search-ad revenue to the extent that Google does.

Google: Fundamentals of writing meta descriptions don't change with longer search snippets

Earlier this month, Google confirmed it has extended search results snippets from 160 characters to a maximum of 320 characters. Google told Search Engine Land that even though the snippets can be longer, the “fundamentals of writing a description tag” have not changed.

Google may or may not show 320 characters; Google may or may not show your meta description; and Google may or may not show content from your page. A lot of how Google decides what search result snippet to show is based on the searchers’ query and the content on your page. A Google spokesperson told us “there’s no need for publishers to suddenly expand their meta description tags, if they feel their current ones are adequate. … We now display slightly longer snippets, which means we might display more of a meta description tag.”

In short, if you are happy with the way your meta descriptions show to your searchers, then leave them. If you are not, you can try changing them. Either way, meta descriptions do not play a role in search rankings, but they do play a role in what searchers see in the Google search results and can have an impact on your click-through rate from the Google search results.

Here is Google’s official statement on the snippets change:

The fact that our snippets have gotten longer doesn’t change the fundamentals of writing a description tag. They should generally inform and interest users with a short, relevant summary of what a particular page is about. We now display slightly longer snippets, which means we might display more of a meta description tag. However, we never had a limit on meta description tag length before, as we covered earlier this year. So, there’s no need for publishers to suddenly expand their meta description tags, if they feel their current ones are adequate. As a reminder, our snippets are dynamically generated. Sometimes, they use what’s in a meta description tag. More often, they are generated by showing content from the page itself and perhaps parts of the meta description tag, as is appropriate for individual queries. For more guidance on meta description tags and snippet generation, we recommend publishers read our recent blog post on the topic, our help page and the “Create good titles & snippets” section of our SEO starter guide that was just updated this week.

John Mueller from Google also commented about this in detail in a recent Google Hangout at the 29:41 mark in that Hangout. Here is what he said:

There’s a lot of talk about expanded meta descriptions, but people are split on whether or not SEOs should update existing metas or let Google expand them for us. What’s your take?

So I saw some discussion around this; I don’t know what people have been discussing.

So in general, one of the things we’ve been experimenting with [is] showing longer descriptions in the search results, and I believe that’s something that more and more people are seeing.

So for the descriptions that we show, we try to focus on the meta description that you provide on your pages, but if we need more information or more context based on the user’s query, then we can take some parts of the page as well. Essentially, from a purely technical point of view, these descriptions aren’t a ranking factor for anything. So it’s not the case that changing your descriptions or making them longer or shorter or tweaking them or putting keywords in there will affect your site’s ranking. However, it can affect the way that users see your site in the search results and whether or not they actually click through to your site. So that’s one aspect there to keep in mind.

And with that aspect, sometimes it does make sense to make sure that the description you’re providing to search engines — the one that’s perhaps being shown to users when they search for normal things on your website — is something that explains what your service is, what your page offers, maybe the unique proposition that you have on your page. That kind of encourage[s] people to click through to your page, and that probably makes sense for a lot of cases. And sometimes it makes sense to say, well, I know how to describe this best, therefore I’ll write it up in the description, and if Google can show this, then my hope is that people will see my site as being clearly superior to all the other ones and click on my site rather than some of the other ones that are ranking in the same search results page.

So with that in mind: it’s not a ranking factor, but it can affect how your site is visible in the search results. So with that, I definitely see it as something legitimate where you might say, well, I want to make sure that my kind of proposition is out there in full, and therefore I’ll try to write something a bit longer and show that in my meta description.

The one thing to keep in mind there is that we adjust the description based on the user’s query. So if you’re doing a site: query and seeing this in your search results for your site, that’s not necessarily what a normal user would see when they search. So make sure to check in Search Console’s Search Analytics what the top queries are that are leading to your pages, try those queries out and see what your search results look like. And if you want to change the snippet that’s shown for individual pages on your site, then by all means go off and do that.

So check out your analytics, look to see if you can improve your click-through rates on your popular pages in search and see if it makes a difference to your bottom line.

Google bringing the Assistant to tablets and Lollipop Android phones

Google is rolling out the Assistant to more devices. It will soon be available on Android tablets running Nougat and Marshmallow, and smartphones running Lollipop.

Tablets in the US set to English will be the first to get access. However, a wide array of Android 5.0 (Lollipop) smartphones will get the Assistant: those operating in English in major markets and in Spanish in the US, Mexico and Spain, as well as Lollipop smartphones in Italy, Japan, Germany, Brazil and Korea.

Google is pushing the Assistant out to more devices as the market becomes more competitive and AI development accelerates.

A July 2017 report from Verto Analytics found that 42 percent of US smartphone owners used virtual assistants, in the aggregate, on average 10 times per month. That translated into more than 70 million smartphone owners and almost 1 billion hours per month in the US. The numbers are likely somewhat higher now.

Personal Assistant Usage Numbers & Demographics

Source: Verto Analytics (5/17)

Siri was the most used (largest audience), but Cortana and Alexa were the fastest-growing assistants, according to Verto.

Separate research has found that virtual assistants are used much more frequently on smart speakers, which makes sense because of the general absence of screens: almost three uses per day vs. less than one for smartphones.

Google Search Console beta adds 12+ months of data to performance reports

The new beta version of Google Search Console has now added over 12 months of historical data to the performance reports.

Here is a screen shot showing the options of date filters for the report, including last seven days, last 28 days, last three months, last six months, last 12 months and full duration:

Glenn Gabe of G-Squared Interactive is also able to see it in his beta Google Search Console reports:

On the first day of Christmas, Google gave to me… *12 months of data in the new GSC*!!

OMG, here we go folks. I'm seeing 12 months of data in the Search Analytics beta. I asked and I've been told I can share this screenshot.

Happy Holidays to all SEOs. 🙂 pic.twitter.com/VhL5qsMlRW

— Glenn Gabe (@glenngabe) December 13, 2017

I suspect the “full duration” means Google will be showing even more than 12 months of data in these reports, although that is unconfirmed and unclear at this moment. We will keep you posted on Google’s answer to that question.

If you are part of the beta, you should be able to access this information now.

Google has been hinting it would be giving webmasters longer-term data since 2013, and now, a few years later, we’ve got it, at least in beta.

Visualizing your site structure in advance of a major change

In our last article, we looked at some interesting ways to visualize your website structure to illuminate how external links and PageRank flow through it. This time, we’re going to use the same tools, but we’re going to look instead at how a major site structure change might impact your site.

Search engine crawlers can determine which pages on your site are the most important, based, in part, on how your internal links are structured and organized. Pages that have a lot of internal links pointing to them — including links from the site’s navigation — are generally considered to be your most important pages. Though these are not always your highest-ranking pages, high internal PageRank often correlates with better search engine visibility.

Note: I use the phrase “internal PageRank,” coined by Paul Shapiro, to refer to the relative importance of each page within a single website based on that site’s internal linking structure. This term may be used interchangeably with “page weight.”

The technique I’ll outline below can be used to consider how internal PageRank will be impacted by the addition of new sections, major changes to global site navigation (as we’ll see below) and most major changes to site structure or internal linking.

Understanding how any major change to a site could potentially impact its search visibility is paramount to determining the risk vs. reward of its implementation. This is one of the techniques I’ve found most helpful in such situations, as it provides numbers we can reference to understand if (and how) page weight will be impacted by a structural adjustment.

In the example below, we’re going to assume you have access to a staging server, and that on that server you will host a copy of your site with the considered adjustments. In the absence of such a server, you can edit the spreadsheets manually to reflect the changes being considered. (However, to save time, it’s probably worth setting up a secondary hosting account for the tests and development.)

It’s worth noting that on the staging server, one need only mimic the structure and not the final design or content. Example: For a site that I’m working on, I considered removing a block of links in a drop-down from the global site navigation and replacing that block of links with a single text link. That link would go to a page containing the links that were previously in the drop-down menu.

When I implemented this site structure change on the staging server, I didn’t worry about whether any of this looked good — I simply created a new page with a big list of text links, removed all the links from the navigation drop-down, and replaced the drop-down with a single link to the new page.

I would never put this live, obviously — but my changes on the staging server mimic the site structure change being considered, giving me insight into what will happen to the internal PageRank distribution (as we’ll see below). I’ll leave it to the designers to make it look good.

For this process, we’re going to need three tools:

- Screaming Frog — The free version will do if your site is under 500 pages or you just want a rough idea of what the changes will mean.
- Gephi — A free, powerful data visualization tool.
- Google Analytics

So, let’s dive in…

Collecting your data

I don’t want to be redundant, so I’ll spare you re-reading about how to crawl and export your site data using Screaming Frog. If you missed the last piece, which explains this process in detail, you can find it here.

Once the crawl is complete and you have your site data, you need simply export the relevant data as follows:

Bulk Export > Response Codes > Success (2xx) Inlinks

You will do this for both your live site and your staging site (the one with the adjusted structure). Once you have downloaded both structures, you’ll need to format them for Gephi. All that Gephi needs to create a visualization is an understanding of your site pages (“nodes”) and the links between them (“edges”).

Note: Before we ready the data, I recommend doing a Find & Replace in the staging CSV file and replacing your staging server domain/IP with that of your actual site. This will make it easier to use and understand in future steps.

As Gephi doesn’t need a lot of the data from the Screaming Frog export, we’ll want to strip out what’s not necessary from these CSV files by doing the following:

- Delete the first row containing “Success (2xx) Inlinks.”
- Rename the “Destination” column “Target.”
- Delete all other columns besides “Source” and “Target.” (Note: Before deleting it, you may want to do a quick Sort by the Type column and remove anything that isn’t labeled as “AHREF” — CSS, JS, IMG and so on — to avoid contaminating your visualization.)
- Save the edited file. You can name it whatever you’d like. I tend to use domain-live.csv and domain-staging.csv. (A scripted version of these steps is sketched after this list.)
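If you would rather script these edits than make them by hand, here is a rough pandas sketch of the same cleanup. It assumes the export layout described above (a title row, then columns including Type, Source and Destination); the file names are placeholders.

```python
# Sketch of the Screaming Frog export cleanup described above, using pandas.
# Assumes the "Success (2xx) Inlinks" export: a title row, then a header row
# with (at least) Type, Source and Destination columns.
import pandas as pd

def prepare_edges(export_path, output_path):
    df = pd.read_csv(export_path, skiprows=1)           # skip the title row
    df = df[df["Type"] == "AHREF"]                       # drop CSS/JS/IMG rows
    df = df.rename(columns={"Destination": "Target"})    # Gephi wants Source/Target
    df[["Source", "Target"]].to_csv(output_path, index=False)

prepare_edges("live_success_2xx_inlinks.csv", "domain-live.csv")
prepare_edges("staging_success_2xx_inlinks.csv", "domain-staging.csv")
```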

The third set of data we’ll want to have is an Export of our organic landing pages from Google Analytics. You can use different metrics, but I’ve found it extremely helpful to have a visual of which pages are most responsible for my organic traffic when considering the impact of a structural change on page weight. Essentially, if you find that a page responsible for a good deal of your traffic will suffer a reduction in internal PageRank, you will want to know this and adjust accordingly.

To get this information into the graph, simply log into Google Analytics, and in the left-hand navigation under “Behavior,” go to “Site Content” and select “Landing Pages.” In your segments at the top of the page, remove “All Users” and replace it with “Organic Traffic.” This will restrict your landing page data to only your organic visitors.

Expand the data to include as many rows as you’d like (up to 5,000) and then Export your data to a CSV, which will give you something like:

Remove the first six rows so your heading row begins with the “Landing Page” label. Then, scroll to the bottom and remove the accumulated totals (the last row below the pages), as well as the “Day Index” and “Sessions” data.

Note that you’ll need the Landing Page URLs in this spreadsheet to be in the same format as the Source URLs in your Screaming Frog CSV files. In the example shown above, the URLs in the Landing Page column are missing the protocol (https) and subdomain (www), so I would need to use a Find & Replace to add this information.

Now we’re ready to go.

Getting a visualization of your current site

The first step is getting your current site page map uploaded — that is, letting Gephi know which pages you have and what they link to.

To begin, open Gephi and go to File > Import Spreadsheet.  You’ll select the live site Screaming Frog export (in my case, yoursite-live.csv) and make sure the “As table:” drop-down is set to “Edges table.”

On the next screen, make sure you’ve checked “Create missing nodes,” which will tell Gephi to create nodes (read: pages) for the “Edges table” (read: link map) that you’ve entered. And now you’ve got your graph. Isn’t it helpful?

OK, not really — but it will be. The next step is to get that Google Analytics data in there. So let’s head over to the Data Laboratory (among the top buttons) and do that.

First, we need to export our page data. When you’re in the Data Laboratory, make sure you’re looking at the Nodes data and Export it.

When you open the CSV, it should have the following columns:

- Id (which contains your page URLs)
- Label
- Timeset

You’ll add a fourth column with the data you want to pull in from Google Analytics, which in our case will be “Sessions.” You’ll need to temporarily add a second sheet to the CSV and name it “analytics,” where you’ll copy the data from your analytics export earlier (essentially just moving it into this Workbook).

Now, what we want to do is fill the Sessions column with the actual session data from analytics. To do this, we need a formula that will look through the node Ids in sheet one and look for the corresponding landing page URL in sheet two; when it finds it, it should insert the organic traffic sessions for that page into the Sessions column where appropriate.

Probably my most-used Excel formula does the trick here. In the top cell of the “Sessions” column you created, enter the following (the row references — 236 in this example — will change based on the number of rows of data you have in your analytics export):

=IFERROR(INDEX(analytics!$B$2:$B$236,MATCH(A2,analytics!$A$2:$A$236,0),1),"0")

Once completed, you’ll want to copy the Sessions column and use the “Paste Values” command, which will switch the cells from containing a formula to containing a value.
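For those more comfortable scripting than working in Excel, the same lookup (including the protocol/subdomain fix from the analytics export step) can be done with a pandas merge. This is a sketch under the assumption that your files use the column names shown above; the file names and domain prefix are placeholders.

```python
# Sketch of the sessions lookup as a pandas merge instead of INDEX/MATCH.
# File names and the domain prefix are placeholders for your own values.
import pandas as pd

nodes = pd.read_csv("nodes-export.csv")                # Gephi export: Id, Label, Timeset
analytics = pd.read_csv("organic-landing-pages.csv")   # GA export: Landing Page, Sessions

# Make the GA landing pages match the node Ids by adding protocol and subdomain.
analytics["Id"] = "https://www.yoursite.com" + analytics["Landing Page"]

merged = nodes.merge(analytics[["Id", "Sessions"]], on="Id", how="left")
merged["Sessions"] = merged["Sessions"].fillna(0).astype(int)
merged.to_csv("data-laboratory-export.csv", index=False)
```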

All that’s left now is to re-import the new sheet back into Gephi. Save the spreadsheet as something like data-laboratory-export.csv (or just nodes.csv if you prefer). Using the Import feature from in the Data Laboratory, you can re-import the file, which now includes the session data.

Now, let’s switch from the Data Laboratory tab back to the Overview tab. Presently, it looks virtually identical to what it had previously — but that’s about to change. First, let’s apply some internal PageRank. Fortunately, a PageRank feature is built right into Gephi based on the calculations of the initial Google patents. It’s not perfect, but it’s pretty good for giving you an idea of what your internal page weight flow is doing.

To accomplish this, simply click the “Run” button beside “PageRank” in the right-hand panel. You can leave all the defaults as they are.

The next thing you’ll want to do is color-code the nodes (which represent your site pages) based on the number of sessions and size them based on their PageRank. To do this, simply select the color palette for the nodes under the “Appearance” pane to the upper left. Select sessions from the drop-down and choose a palette you like. Once you’ve chosen your settings, click “Apply.”

Next, we’ll do the same for PageRank, except we’ll be adjusting size rather than color. Select the sizing tool, choose PageRank from the drop-down, and select the maximum and minimum sizes (this will be a relative sizing based on page weight). I generally start with 10 and 30, respectively, but you might want to play around with them. Once you’ve chosen your desired settings, click “Apply.”

The final step of the visualization is to select a layout in the bottom left panel. I like “Force Atlas” for this purpose, but feel free to try them all out. This gives us a picture that looks something like the following:

You can easily reference which pages have no organic traffic and which have the most based on their color — and by right-clicking them, you can view them directly in the Data Laboratory to get their internal PageRank. (In this instance, we can see that one of the highest-traffic pages is a product page with a PageRank of 0.016629.) We can also see how our most-trafficked pages tend to be clustered toward the center, meaning they’re heavily linked within the site.

Now, let’s see what happens with the new structure. You’ll want to go through the same steps above, but with the Screaming Frog export from the staging server (in my case, domain-staging.csv). I’m not going to make you read through all the same steps, but here’s what the final result looks like:

We can see that there are a lot more outliers in this version (pages that have generally been significantly reduced in their internal links). We can investigate which pages those are by right-clicking them and viewing them in the Data Laboratory, which will help us locate possible unexpected problems.

We also have the opportunity to see what happened to that high-traffic product page mentioned above. In this case, under the new structure, its internal PageRank shifted to 0.02171 — in other words, it got stronger.

There are two things that may have caused this internal PageRank increase: an increase in the number of links to the page, or a drop in the number of links to other pages.

At its core, a page can be considered as having 100 percent of its PageRank. Setting aside considerations like Google’s reduction of PageRank with each link, or weighting by position on the page, PageRank flows to other pages via links, and that “link juice” is split among the links. So, if there are 10 links on a page, each will get 10 percent. If you drop the total number of links to five, then each will get 20 percent.

Again, this is a fairly simplified explanation, but these increases (or decreases) are what we want to measure to understand how a proposed site structure change will impact the internal PageRank of our most valuable organic pages.
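As a toy illustration of that arithmetic, the sketch below builds a tiny link graph and recomputes PageRank after trimming links from the home page. It uses the networkx library’s PageRank implementation as a stand-in for Gephi’s calculation — an assumption for illustration only — and the pages are made up.

```python
# Toy example: a product page's internal PageRank rises when the home page
# links out to fewer competing pages. networkx stands in for Gephi here.
import networkx as nx

def toy_site(nav_links):
    g = nx.DiGraph()
    g.add_edge("/", "/product")            # home links to the product page...
    g.add_edge("/product", "/")
    for i in range(nav_links):             # ...plus a number of navigation links
        g.add_edge("/", f"/category-{i}")
        g.add_edge(f"/category-{i}", "/")
    return nx.pagerank(g, alpha=0.85)

before = toy_site(nav_links=10)   # crowded navigation
after = toy_site(nav_links=4)     # trimmed navigation
print("Product page PageRank before:", round(before["/product"], 4))
print("Product page PageRank after: ", round(after["/product"], 4))
```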

Over in the Data Laboratory, we can also order pages by their PageRank and compare results (or just see how our current structure is working out).

And…

This is just the tip of the iceberg. We can swap the organic sessions in the page-based data we import for rankings (or go crazy and include both). With this data, we can judge what might happen to the PageRank of ranking (or up-and-coming) pages in a site structure shift. Or what about factoring in incoming link weight, as we did in the last article, to see how its passing is impacted?

While no tool or technique can give you 100 percent assurance that a structural change will always go as planned, this technique assists in catching many unexpected issues. (Remember: Look to those outliers!)

This exercise can also help surface unexpected opportunities by isolating pages that will gain page weight as a result of a proposed site structure change. You may wish to (re)optimize these pages before your site structure change goes live so you can improve their chances of getting a rankings boost.

Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.

Compare 13 leading SEO platforms

SEO software comes in many shapes and sizes, from rank-checking tools and keyword research toolsets to full-service solutions that manage keywords, links, competitive intelligence, international rankings, social signal integration and workflow rights and roles.

How do you decide which one is right for your organization?

MarTech Today’s “Enterprise SEO Platforms: A Marketer’s Guide” examines the market for SEO platforms and the considerations involved in implementing this software into your business.

This 42-page report includes profiles of 13 leading SEO platform vendors, pricing information, capabilities comparisons and recommended steps for evaluating and purchasing.

Visit Digital Marketing Depot to download “Enterprise SEO Platforms: A Marketer’s Guide.”

Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.