Looking for a handy way to scrape, collect and organize data from your website? Our brand new feature lets you extract any piece of content from your pages. Build your own filters with our Custom Fields and find them directly in your Data Explorer.
Our ‘Custom Fields’ feature offers different actionable use cases:
Possibilities are unlimited and the use cases above are just a few examples.
Our Custom Fields can be set in your crawl settings:
We support two kinds of expressions: basic regular expressions (see the guide) or XPath expressions (see the guide). The choice matters because it determines how the rule is evaluated.
Sample: <meta itemprop="ratingValue" content="4.5">
Rules: <meta itemprop="ratingValue" content="([0-9]+(\.[0-9]*)?)">
Output: 4.5
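To see what this rule does outside the crawler, here is a minimal sketch using Python's standard `re` module. It assumes the crawler returns the first capturing group of the first match, which is what the output above suggests; the helper name `extract_first_group` is ours, not part of the product.

```python
import re

# Sample markup and rule from the example above, written with straight quotes.
SAMPLE = '<meta itemprop="ratingValue" content="4.5">'
RULE = r'<meta itemprop="ratingValue" content="([0-9]+(\.[0-9]*)?)">'


def extract_first_group(rule: str, html: str):
    """Return the first capturing group of the first match, or None."""
    match = re.search(rule, html)
    return match.group(1) if match else None


rating = extract_first_group(RULE, SAMPLE)
print(rating)
```

The outer parentheses capture the whole number, while the inner group only makes the decimal part optional.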
Sample: <meta itemprop="ratingValue" content="4.5">
Rules: string(//meta[@itemprop='ratingValue']/@content)
Output: 4.5
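The same lookup can be approximated locally with Python's standard library. Note the assumptions: `xml.etree.ElementTree` only supports a subset of XPath and has no `string()` function, so this sketch selects the element and reads its `content` attribute instead. The sample is also wrapped in a `<head>` element and the `<meta>` tag is self-closed so that it parses as well-formed XML.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the sample above (self-closing <meta>, wrapped in <head>).
SAMPLE = '<head><meta itemprop="ratingValue" content="4.5"/></head>'


def extract_rating(markup: str):
    """Return the content attribute of the ratingValue meta tag, or None."""
    root = ET.fromstring(markup)
    # ElementTree equivalent of //meta[@itemprop='ratingValue']
    node = root.find(".//meta[@itemprop='ratingValue']")
    return node.get("content") if node is not None else None


rating = extract_rating(SAMPLE)
print(rating)
```

Real-world HTML is rarely well-formed XML, so a dedicated HTML parser with full XPath support would be a better fit for anything beyond a quick check.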
This type of extraction is ideal for capturing a product’s price or rating.
This one can be used to extract a list of similar products
This type of extraction is well suited to checking for analytics or advertiser tags.
This rule is perfect for counting the number of comments on an article or the number of ads on a page.
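The four behaviours described above can be sketched in a few lines of Python. Everything here is illustrative: the HTML fragment, the rules and the function names are hypothetical, and the sketch only mirrors the idea of each extraction type rather than how the crawler implements it.

```python
import re

# Hypothetical page fragment, used only for illustration.
HTML = (
    '<span class="price">19.90</span>'
    '<li class="comment">Great</li>'
    '<li class="comment">Too pricey</li>'
    '<script src="https://example.com/analytics.js"></script>'
)


def first_value(rule, html):
    """Single value: first capturing group of the first match."""
    match = re.search(rule, html)
    return match.group(1) if match else None


def all_values(rule, html):
    """List of values: first capturing group of every match."""
    return [m.group(1) for m in re.finditer(rule, html)]


def exists(rule, html):
    """Check if exists: True when the rule matches at least once."""
    return re.search(rule, html) is not None


def count_occurrences(rule, html):
    """Number of occurrences: how many times the rule matches."""
    return sum(1 for _ in re.finditer(rule, html))


price = first_value(r'<span class="price">([0-9.]+)</span>', HTML)
comments = all_values(r'<li class="comment">([^<]*)</li>', HTML)
has_tag = exists(r'analytics\.js', HTML)
n_comments = count_occurrences(r'<li class="comment">', HTML)
```

With this fragment, `price` holds one string, `comments` holds a list, `has_tag` is a boolean and `n_comments` is an integer.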
Field formats are important because they enable query operators in our OQL (OnCrawl Query Language) as well as sorting values in the Data Explorer tables.
Note: depending on the type of extraction, this choice is disabled. ‘Check if exists’ forces the field to be a boolean, while ‘Length’ and ‘Number of occurrences’ both force the field to be an integer.
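A quick way to see why the format matters is to compare how the same values sort as text and as numbers. The sketch below is plain Python with hypothetical values; it does not use OQL, it only illustrates the general effect of typing a field.

```python
# Hypothetical extracted values, stored as raw strings.
raw_prices = ["9.5", "10", "100", "25"]

# Sorted as text: compared character by character, so "10" comes before "9.5".
as_strings = sorted(raw_prices)

# Sorted as numbers: compared by value.
as_numbers = sorted(raw_prices, key=float)

# A numeric comparison only makes sense once the values are treated as numbers.
cheap = [p for p in raw_prices if float(p) < 20]

print(as_strings)
print(as_numbers)
print(cheap)
```

Declaring the field as a number gives you the second ordering and makes "less than" or "greater than" filters meaningful.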
Sample: <strong class="product-price">249<sup>€99</sup></strong>
Rules: <strong[^>]+>\s*([0-9]+)<sup>€([0-9]+)</sup>\s*
Field format: Formatted value
Formatted value: {0}.{1}€
Output: 249.99€
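The formatted value behaves much like a positional template: `{0}` is replaced by the first captured group and `{1}` by the second. Here is a minimal sketch of that idea with Python's `re` and `str.format`. The rule is written so that it steps over the `<sup>` tag present in the sample, and the helper name `format_match` is hypothetical.

```python
import re

SAMPLE = '<strong class="product-price">249<sup>€99</sup></strong>'
# Assumption: the rule accounts for the <sup> tag found in the sample.
RULE = r'<strong[^>]+>\s*([0-9]+)<sup>€([0-9]+)</sup>\s*'
TEMPLATE = "{0}.{1}€"


def format_match(rule, template, html):
    """Fill the template with the captured groups, or return None if no match."""
    match = re.search(rule, html)
    if match is None:
        return None
    return template.format(*match.groups())


price = format_match(RULE, TEMPLATE, SAMPLE)
print(price)
```

Two capturing groups feed the two placeholders, so the euros and cents end up joined by a dot.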
You need to add a name to your newly created fields to easily find them in the Data Explorer.
You can directly test the rule by hitting the ‘Check’ button with a sample of different pages or by copying a piece of HTML code to make sure everything works as expected.
Then, go to your Data Explorer, click on ‘add columns’ and select the Custom Field you have created.
You can also filter your URLs directly by Custom Field. Select ‘Set your filter’ and the Custom Field you’ve just created. Then, define your query (‘True’ or ‘False’ here) and hit ‘Apply Filters’.
Only the URLs that match the requested Custom Field are then displayed:
You are now ready to play with your new filters!
Our Custom Fields are available as an option starting from the Pro plan. Want to try them?
Contact us to activate your Custom Fields.