{"id":62,"date":"2022-01-21T10:00:00","date_gmt":"2022-01-21T10:00:00","guid":{"rendered":"http:\/\/localhost\/ukwebgeekz.com\/?p=62"},"modified":"2022-01-21T16:09:20","modified_gmt":"2022-01-21T16:09:20","slug":"custom-extraction-screaming-frog","status":"publish","type":"blog","link":"https:\/\/www.ukwebgeekz.com\/blog\/custom-extraction-screaming-frog\/","title":{"rendered":"Custom Extraction Methods & Screaming Frog"},"content":{"rendered":"\n

It should not need saying that a website is essential<\/a> these days for any business to succeed. With literally billions of potential customers online, businesses simply can\u2019t afford to miss out. In 2018, global \u2018e-retail\u2019 sales totalled something in the region of $2.8 trillion. The projected figure for 2021 is nearer to $4.8 trillion. <\/p>\n\n\n\n

Competition for online business is fierce, as you might expect. The companies with the best SEO strategies<\/a> are clearly going to win. Those who don\u2019t invest in SEO<\/a>, or who neglect their websites, will suffer \u2013 it\u2019s as simple as that. <\/p>\n\n\n\n

You might have excellent SEO<\/a> and a great website. But how do you give your business the edge over your competition? There are many tools and tricks out there to boost your chances and help you maximise opportunities, and custom extraction is one.<\/p>\n\n\n\n

What is Custom Extraction?<\/h2>\n\n\n\n

First of all, let\u2019s think back to the average office about fifty years ago. Immediately, you\u2019d note the absence of computers in most offices, and where they were present they weren\u2019t exactly \u2018user friendly\u2019 or powerful \u2013 compared with today\u2019s models. When data was required, it would mostly be left to humans to gather. It was often a painstaking, laborious, and thankless task. And then there is the fact of \u2018human error\u2019. Yes, reports were made, and frequently made well. But it took time and effort that often exceeded its usefulness. As time went on, and more data became available, it became less worthwhile. The internet wasn\u2019t around at the time, so you couldn\u2019t dive in to extract the data you required.<\/p>\n\n\n\n

Jump forward to today, and we have information at our fingertips 24 hours a day. In fact, we have vast, overwhelming amounts of information and data. In a sense, we face a similar problem to our counterparts of half a century ago: sifting through masses of data to find anything of value is not a good use of our time, and people still make mistakes. This is where custom extraction can be a useful technique, and Screaming Frog is our favourite tool for it. The software crawls websites and grabs the data you need, then saves it in a file for you to analyse and present in a range of formats.<\/p>\n\n\n\n

And this can save you resources, time, and money.<\/p>\n\n\n\n

How does it work?<\/h2>\n\n\n\n

If you\u2019re not familiar with the technical terms, it can all get a bit confusing. Basically, all websites are built using a markup language called HTML. Other languages, such as CSS, handle the design and presentation, but HTML is the basic framework that all internet browsers recognise. If you were to look at the actual code for a website, you might be amazed at just how much is there. Searching through it by hand to find a specific bit of information could take a while.<\/p>\n\n\n\n

Using a \u2018web scraper\u2019 you can find that information quickly. These scrapers are also sometimes referred to as \u2018bots\u2019, \u2018spiders\u2019, or \u2018spiderbots\u2019 and are the same type of tools used by search engines to scour the millions of pages that make up the internet in order to index relevant sites for any particular search.<\/p>\n\n\n\n
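As a rough illustration of what a scraper does with a page, here is a minimal Python sketch that pulls the text out of one chosen tag. The sample HTML, the class name TextByTag and the function extract_text are all made up for this example, and a tool such as Screaming Frog does far more than this.

```python
from html.parser import HTMLParser

# Hypothetical page source; a real page would be far longer.
SAMPLE_HTML = """
<html>
  <head><title>55-inch TV | Example Store</title></head>
  <body>
    <h1>55-inch TV</h1>
    <p class="stock">Out of stock</p>
  </body>
</html>
"""


class TextByTag(HTMLParser):
    """Collect the text found inside one chosen tag, e.g. every <p>."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0   # greater than 0 while we are inside the chosen tag
        self.found = []  # one string per matching element

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1
            self.found.append('')

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current entry.
        if self.depth:
            self.found[-1] += data


def extract_text(html, tag):
    """Return the stripped text of every <tag> element in the HTML string."""
    parser = TextByTag(tag)
    parser.feed(html)
    return [text.strip() for text in parser.found]


print(extract_text(SAMPLE_HTML, 'title'))
print(extract_text(SAMPLE_HTML, 'p'))
```

The point is only that a program can walk the markup and return the one value you care about, instead of a person reading the whole source.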

Web data extraction is growing in popularity as businesses are waking up to its importance. Custom data extraction takes this a step further by speeding up the process and automating it. You can download software like Screaming Frog<\/a> to perform this for you, set the required parameters, and wait for the results.<\/p>\n\n\n\n

And these results can seriously boost your company\u2019s performance, or replace a time-consuming method with a fast, efficient data export.<\/p>\n\n\n\n

An example of custom extraction for stock checks.<\/h2>\n\n\n\n

Many large online stores sell not only their own products but also those of suppliers, small or large, that may not have high-tech logistics or even a stock list at all. This becomes a problem when customers on your site buy a product that you then cannot supply.<\/p>\n\n\n\n

So how can we fix this?<\/h3>\n\n\n\n

You need a little coding knowledge, specifically HTML and CSS, but it really is quite simple. The exact code or method you use might differ slightly between websites, depending on the structure or validity of their code, or on whether the content is behind a login wall.<\/p>\n\n\n\n

You will need a tool like XPather<\/a> to inspect the code on the page you want to extract.<\/p>\n\n\n\n

For example:<\/h4>\n\n\n\n

A TV site might list some items as \u201cout of stock\u201d, \u201climited stock\u201d, or \u201cdue back in October\u201d, alongside each item\u2019s product code or identifier on the same page. We can use XPather<\/strong> to give us the CSS path, text path, or inner HTML path of those values.<\/p>\n\n\n\n

It will look something like this<\/h4>\n\n\n\n
\/\/div[@id='product-information']\/p[1]<\/code><\/pre>\n\n\n\n
\/\/div[@id='product-details']\/p[1]<\/code><\/pre>\n\n\n\n

These two XPath expressions point to the exact location, within the page\u2019s HTML source, of content that you as a visitor can see. We can enter them into the \u201ccustom extraction\u201d tool within Screaming Frog and set the bot running. You can define a crawl speed, but be careful: crawling too fast is unfair on the site you are crawling. That topic would need to be covered in another post.<\/p>\n\n\n\n
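To see what the two expressions above select, here is a small Python sketch using only the standard library. The sample page, the product code and the decision to treat the first expression as the product code and the second as the stock status are assumptions made for illustration. ElementTree only accepts relative paths, so each expression gets a leading dot, and it needs well-formed markup, so a real page would usually call for a more forgiving HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed product page. Real pages are often not valid XML.
SAMPLE_PAGE = """
<html>
  <body>
    <div id="product-information">
      <p>TV-55-1234</p>
      <p>Free delivery</p>
    </div>
    <div id="product-details">
      <p>Out of stock</p>
      <p>Due back in October</p>
    </div>
  </body>
</html>
"""

# Same expressions as in the article, made relative for ElementTree.
PRODUCT_CODE_XPATH = ".//div[@id='product-information']/p[1]"  # assumed meaning
STOCK_STATUS_XPATH = ".//div[@id='product-details']/p[1]"      # assumed meaning


def extract(page, xpath):
    """Return the stripped text of the first node matching xpath, or None."""
    root = ET.fromstring(page)
    node = root.find(xpath)
    if node is None or node.text is None:
        return None
    return node.text.strip()


product_code = extract(SAMPLE_PAGE, PRODUCT_CODE_XPATH)
stock_status = extract(SAMPLE_PAGE, STOCK_STATUS_XPATH)
print(product_code, stock_status)
```

Each expression reads as: find the div with the given id anywhere in the page, then take its first paragraph.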

\"Custom<\/figure>\n\n\n\n
An example of the custom extraction being added<\/em><\/figcaption>\n\n\n\n

Once Screaming Frog has finished crawling the site (you can check its progress in the custom extraction tab), you can export the data to a spreadsheet. In this case it would contain one column of product codes with each product\u2019s stock status in the next column. We can then use this information to run a bulk export\/import on the eCommerce site to hide out-of-stock items or un-hide items that are back in stock. This was a lifesaver for one of our clients, who at the time were checking stock manually or only noticing problems after orders had been placed.<\/p>\n\n\n\n
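The spreadsheet step could be sketched in Python along these lines. The column names product_code and stock_status, the sample rows and the rule for what counts as unavailable are all assumptions for this example. A real export would use whatever names you gave the extractors, and the import format depends on your eCommerce platform.

```python
import csv
import io

# Hypothetical export: column names and values are assumptions for this sketch.
SAMPLE_EXPORT = """product_code,stock_status
TV-55-1234,Out of stock
TV-43-5678,In stock
TV-65-9012,Due back in October
TV-32-3456,Limited stock
"""

# Statuses treated as unavailable (assumed wording, compared in lower case).
UNAVAILABLE_PREFIXES = ('out of stock', 'due back')


def split_by_availability(csv_text):
    """Return (codes_to_hide, codes_to_show) from the exported rows."""
    to_hide, to_show = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        status = row['stock_status'].strip().lower()
        if status.startswith(UNAVAILABLE_PREFIXES):
            to_hide.append(row['product_code'])
        else:
            to_show.append(row['product_code'])
    return to_hide, to_show


hide, show = split_by_availability(SAMPLE_EXPORT)
print('Hide:', hide)
print('Show:', show)
```

The two lists can then feed the bulk import that hides or un-hides products on the store.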

What are the benefits of Custom Extraction?<\/h2>\n\n\n\n

Here are a few examples of how this software \u2013 and the data it produces \u2013 can help:<\/p>\n\n\n\n

Website \/ SEO audits<\/h4>\n\n\n\n

Your website can hide many technical errors that the human eye may not see, a problem that is even more prevalent on large sites with thousands of pages. With these tools you can quickly gather all of the data you need to spot technical SEO errors.<\/p>\n\n\n\n

Keywords<\/h4>\n\n\n\n

Find the most suitable keywords for your business or product and see which keywords your competitors are using. Compile a chart to see which are ranking highest.<\/p>\n\n\n\n

Research<\/h4>\n\n\n\n

Regularly keep an eye on your competitors through an automated system, giving you information on their finances, funding, pricing, new developments, technology, and so on. Also, the \u2018web scraping\u2019 software can access content that is usually only available through downloading or creating an account.<\/p>\n\n\n\n

Monitoring news sites<\/h4>\n\n\n\n

Some businesses operate in markets that are affected by events around the globe. This software can automatically gather data that helps them take timely and appropriate action \u2013 possibly before their rivals.<\/p>\n\n\n\n

Locating a target audience<\/h4>\n\n\n\n

Custom extraction tools can search social media to discover general trends and to see which products are in demand. They can also explore existing customers to determine satisfaction levels. Both methods help to direct future marketing campaigns, seize opportunities, and maintain customer loyalty.<\/p>\n\n\n\n

Brand image and awareness<\/h4>\n\n\n\n

Image is everything. The World Wide Web gives virtually everyone the opportunity to express their opinion on anything, including your business. Through web data extraction you can find out exactly what people think. Then you can use this data to maintain that image, or take steps to improve it, tracking changes through new reports.<\/p>\n\n\n\n

Track Stock Volumes<\/h4>\n\n\n\n

By performing searches on your own pages, you can keep track of stock levels. One of the most frustrating aspects of online shopping is the message \u2018out of stock\u2019. This can help eCommerce companies massively if they distribute suppliers\u2019 products and do not have up-to-date stock lists from those suppliers. (We have actually used this for a client, and it worked a treat during the busy Covid period!)<\/p>\n\n\n\n

Keep your website updated<\/h4>\n\n\n\n

In cases of \u2018time-sensitive\u2019 events, like Black Friday, you need to quickly locate instances where wording and details need to be adjusted. Data extraction software keeps track of these locations to ensure they are updated.<\/p>\n\n\n\n

Pricing strategy<\/h4>\n\n\n\n

Keep abreast of fluctuations in the prices of services and goods offered by your competitors. Analyse their products, rates, and features.<\/p>\n\n\n\n

The benefits and advantages stretch way beyond this list, but it serves as an example of how useful custom extraction can be. Installing this software makes good sense for any business that is serious about SEO.<\/p>\n\n\n\n

Based in the North East of England, UK Web GeekZ<\/a> are innovators in the search engine optimisation<\/a> industry and use tactics that work time and time again. Give us a call if you would like more traffic or simply a new website design.<\/p>\n\n\n\n