JavaScript Instructions
JavaScript Instructions let you automate interactions with a web page before its content is scraped. You can click buttons, fill in forms, navigate through dynamic content, and perform most other actions a human user could.
Overview
The javascriptInstruction parameter accepts an array of actions that will be executed in sequence on the target webpage. Each instruction is executed after the page loads and before the final content is captured.
Basic Syntax
{
"url": "https://example.com",
"renderJs": true,
"javascriptInstruction": [
{
"action": "clickElement",
"selector": {
"type": "css",
"value": "#load-more-button"
}
}
]
}
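The same request body can also be assembled programmatically. The following is a minimal Python sketch that only builds and prints the JSON; `build_payload` is a hypothetical helper name, not part of the API, and how the body is submitted (endpoint, authentication) is outside its scope.

```python
import json


def build_payload(url, instructions):
    """Assemble a request body with the fields shown in Basic Syntax.

    `build_payload` is an illustrative helper, not part of the API.
    """
    return {
        "url": url,
        "renderJs": True,
        "javascriptInstruction": list(instructions),
    }


click_load_more = {
    "action": "clickElement",
    "selector": {"type": "css", "value": "#load-more-button"},
}

payload = build_payload("https://example.com", [click_load_more])
print(json.dumps(payload, indent=2))
```

Keeping each instruction as a plain dictionary makes it easy to reuse or reorder steps before serializing.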
Available Actions
Navigation Actions
navigateToUrl
Navigate to a different URL within the same session.
{
"action": "navigateToUrl",
"url": "https://example.com/login",
"waitUntil": "networkidle0"
}
Parameters:
- url (string): Target URL to navigate to
- waitUntil (string): When to consider navigation complete
  - "load": Wait for the page load event
  - "domcontentloaded": Wait for DOM content loaded
  - "networkidle0": Wait until there is no network activity
reloadPage
Reload the current page.
{
"action": "reloadPage",
"waitUntil": "load"
}
Element Interaction Actions
clickElement
Click on an element.
{
"action": "clickElement",
"selector": {
"type": "css",
"value": ".load-more-btn",
"state": "visible"
},
"timeout": 10
}
Parameters:
- selector (object): Element selector configuration
- timeout (number): Maximum wait time in seconds
- onError (string): Error handling behavior ("continue" or "return")
doubleClickElement
Double-click on an element.
{
"action": "doubleClickElement",
"selector": {
"type": "css",
"value": "#item-to-select"
}
}
hoverOverElement
Hover over an element to trigger hover effects.
{
"action": "hoverOverElement",
"selector": {
"type": "css",
"value": ".dropdown-trigger"
}
}
Text Input Actions
typeText
Type text into an input field.
{
"action": "typeText",
"selector": {
"type": "css",
"value": "#search-input"
},
"text": "web scraping API",
"delay": 100
}
Parameters:
- text (string): Text to type
- delay (number): Delay between keystrokes in milliseconds
performSearch
Specialized action for search forms.
{
"action": "performSearch",
"selector": {
"type": "css",
"value": "#search-form"
},
"text": "search query"
}
Form Actions
selectOption
Select an option from a dropdown or select element.
{
"action": "selectOption",
"selector": {
"type": "css",
"value": "#country-select"
},
"text": "United States"
}
Keyboard Actions
pressKeyboardKey
Press a specific keyboard key.
{
"action": "pressKeyboardKey",
"key": "Enter"
}
Common keys:
- "Enter"
- "Tab"
- "Escape"
- "ArrowDown"
- "ArrowUp"
- "Backspace"
Scrolling Actions
scrollDown
Scroll down the page.
{
"action": "scrollDown",
"top": 1000,
"left": 0
}
Parameters:
- top (number): Vertical scroll distance
- left (number): Horizontal scroll distance
scrollToElement
Scroll to a specific element.
{
"action": "scrollToElement",
"selector": {
"type": "css",
"value": "#target-section"
}
}
Visibility Actions
hideElement
Hide an element (useful for removing popups or overlays).
{
"action": "hideElement",
"selector": {
"type": "css",
"value": ".popup-overlay"
}
}
Wait Actions
wait
Wait for a specified amount of time.
{
"action": "wait",
"delay": 3000
}
Parameters:
- delay (number): Wait time in milliseconds
waitForSelector
Wait for an element to appear.
{
"action": "waitForSelector",
"selector": {
"type": "css",
"value": ".dynamic-content",
"state": "visible"
},
"timeout": 15
}
waitForNavigation
Wait for page navigation to complete.
{
"action": "waitForNavigation",
"waitUntil": "networkidle0",
"timeout": 30
}
Advanced Actions
executeJavaScript
Execute custom JavaScript code.
{
"action": "executeJavaScript",
"source": "const banner = document.querySelector('.cookie-banner'); if (banner) { banner.style.display = 'none'; }"
}
executeBrowserScript
Execute JavaScript in the browser context and return a value.
{
"action": "executeBrowserScript",
"source": "return document.title;"
}
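Longer scripts are awkward to write by hand inside a JSON string, because quotes and newlines must be escaped. A minimal Python sketch, assuming the source field accepts multi-line scripts (which the examples above do not show): write the script as an ordinary string and let `json.dumps` do the escaping. `js_action` is a hypothetical helper name.

```python
import json

# A guarded version of the cookie-banner script, written as normal JavaScript.
HIDE_BANNER_JS = """
const banner = document.querySelector(".cookie-banner");
if (banner) {
  banner.style.display = "none";
}
"""


def js_action(source, action="executeJavaScript"):
    """Wrap a script in an instruction object (illustrative helper)."""
    return {"action": action, "source": source.strip()}


instruction = js_action(HIDE_BANNER_JS)
encoded = json.dumps(instruction)  # quotes and newlines are escaped here
print(encoded)
```

If multi-line sources turn out not to be accepted, the same helper works with the script collapsed onto one line.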
setLocation
Set geolocation coordinates.
{
"action": "setLocation",
"latitude": 40.7128,
"longitude": -74.0060
}
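Coordinates are easy to mistype. The sketch below, in Python, builds a setLocation instruction and rejects values outside the valid geographic ranges (latitude from -90 to 90, longitude from -180 to 180). `set_location` is a hypothetical helper; whether the API performs its own range checks is not stated here.

```python
def set_location(latitude, longitude):
    """Build a setLocation instruction after checking coordinate ranges."""
    if not -90 <= latitude <= 90:
        raise ValueError(f"latitude out of range: {latitude}")
    if not -180 <= longitude <= 180:
        raise ValueError(f"longitude out of range: {longitude}")
    return {
        "action": "setLocation",
        "latitude": latitude,
        "longitude": longitude,
    }


new_york = set_location(40.7128, -74.0060)
print(new_york)
```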
Selector Types
CSS Selectors
Most common selector type using CSS syntax.
{
"type": "css",
"value": ".class-name",
"state": "visible"
}
XPath Selectors
Use XPath expressions for more complex element selection.
{
"type": "xpath",
"value": "//button[contains(text(), 'Load More')]",
"state": "attached"
}
Selector States:
- "visible": Element is visible on the page
- "hidden": Element exists but is hidden
- "attached": Element is attached to the DOM
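Selector objects can be built with a small helper that catches misspelled types or states before a request is sent. This is a Python sketch that assumes the two types and three states listed above are the complete set; `make_selector` is a hypothetical name.

```python
SELECTOR_TYPES = {"css", "xpath"}
SELECTOR_STATES = {"visible", "hidden", "attached"}


def make_selector(value, kind="css", state=None):
    """Return a selector object, validating its type and optional state."""
    if kind not in SELECTOR_TYPES:
        raise ValueError(f"unknown selector type: {kind!r}")
    if state is not None and state not in SELECTOR_STATES:
        raise ValueError(f"unknown selector state: {state!r}")
    selector = {"type": kind, "value": value}
    if state is not None:
        selector["state"] = state  # omit the key when no state is requested
    return selector


load_more = make_selector(
    "//button[contains(text(), 'Load More')]", kind="xpath", state="attached"
)
print(load_more)
```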
Error Handling
Control how errors are handled during instruction execution:
{
"action": "clickElement",
"selector": {
"type": "css",
"value": ".optional-button"
},
"onError": "continue"
}
Error Handling Options:
- "continue": Continue execution if this action fails
- "return": Stop execution and return the current page state
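When instructions are generated in code, optional steps can be marked in one place rather than by repeating the onError field. A minimal Python sketch; `optional` is a hypothetical helper name.

```python
def optional(instruction):
    """Return a copy of the instruction with onError set to "continue"."""
    return {**instruction, "onError": "continue"}


close_popup = {
    "action": "clickElement",
    "selector": {"type": "css", "value": ".optional-button"},
}

tolerant = optional(close_popup)
print(tolerant)
```

The helper returns a new dictionary, so the original instruction can still be used where a failure should stop the sequence.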
Real-World Examples
E-commerce Product Scraping
Scrape product details from a page that requires clicking "Show More" to reveal the full description:
{
"url": "https://shop.example.com/product/123",
"renderJs": true,
"javascriptInstruction": [
{
"action": "clickElement",
"selector": {
"type": "css",
"value": ".show-more-description"
},
"onError": "continue"
},
{
"action": "wait",
"delay": 2000
},
{
"action": "clickElement",
"selector": {
"type": "css",
"value": ".show-all-reviews"
},
"onError": "continue"
}
]
}
Form Submission and Result Scraping
Fill out a search form and scrape the results:
{
"url": "https://example.com/search",
"renderJs": true,
"javascriptInstruction": [
{
"action": "typeText",
"selector": {
"type": "css",
"value": "#search-input"
},
"text": "web scraping"
},
{
"action": "selectOption",
"selector": {
"type": "css",
"value": "#category-select"
},
"text": "Technology"
},
{
"action": "clickElement",
"selector": {
"type": "css",
"value": "#search-button"
}
},
{
"action": "waitForSelector",
"selector": {
"type": "css",
"value": ".search-results",
"state": "visible"
},
"timeout": 10
}
]
}
Infinite Scroll Content Loading
Load more content from an infinite-scroll page by scrolling a fixed number of times:
{
"url": "https://example.com/feed",
"renderJs": true,
"javascriptInstruction": [
{
"action": "executeJavaScript",
"source": "window.loadedItems = document.querySelectorAll('.item').length;"
},
{
"action": "scrollDown",
"top": 3000
},
{
"action": "wait",
"delay": 3000
},
{
"action": "scrollDown",
"top": 3000
},
{
"action": "wait",
"delay": 3000
},
{
"action": "scrollDown",
"top": 3000
},
{
"action": "waitForSelector",
"selector": {
"type": "css",
"value": ".end-of-content"
},
"timeout": 5,
"onError": "continue"
}
]
}
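None of the actions listed above is a loop, so the number of scroll rounds has to be chosen up front. The repeated scroll and wait pairs can be generated rather than copied by hand. The following is a Python sketch; `scroll_steps` is a hypothetical helper, and the default distance and pause simply mirror the example above.

```python
def scroll_steps(rounds, distance=3000, pause_ms=3000):
    """Return `rounds` scrollDown actions, each followed by a wait."""
    steps = []
    for _ in range(rounds):
        steps.append({"action": "scrollDown", "top": distance})
        steps.append({"action": "wait", "delay": pause_ms})
    return steps


instructions = scroll_steps(rounds=3)
# Finish with an optional check for an end-of-feed marker.
instructions.append({
    "action": "waitForSelector",
    "selector": {"type": "css", "value": ".end-of-content"},
    "timeout": 5,
    "onError": "continue",
})
print(len(instructions))
```

Raising `rounds` loads more of the feed at the cost of a longer request.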
Login and Authenticated Content
Scrape content that requires login:
{
"url": "https://example.com/login",
"renderJs": true,
"javascriptInstruction": [
{
"action": "typeText",
"selector": {
"type": "css",
"value": "#username"
},
"text": "your-username"
},
{
"action": "typeText",
"selector": {
"type": "css",
"value": "#password"
},
"text": "your-password"
},
{
"action": "clickElement",
"selector": {
"type": "css",
"value": "#login-button"
}
},
{
"action": "waitForNavigation",
"waitUntil": "networkidle0",
"timeout": 10
},
{
"action": "navigateToUrl",
"url": "https://example.com/protected-content"
}
]
}
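Credentials are better kept out of source files. The sketch below, in Python, builds the same five steps from values read at run time. The environment variable names `SCRAPER_USERNAME` and `SCRAPER_PASSWORD` and the helpers `css` and `login_flow` are assumptions made for illustration.

```python
import os


def css(value):
    """Shorthand for a CSS selector object."""
    return {"type": "css", "value": value}


def login_flow(username, password, protected_url):
    """Return the login steps from the example above as a list."""
    return [
        {"action": "typeText", "selector": css("#username"), "text": username},
        {"action": "typeText", "selector": css("#password"), "text": password},
        {"action": "clickElement", "selector": css("#login-button")},
        {"action": "waitForNavigation", "waitUntil": "networkidle0", "timeout": 10},
        {"action": "navigateToUrl", "url": protected_url},
    ]


# Hypothetical variable names; the placeholders are used when they are unset.
instructions = login_flow(
    os.environ.get("SCRAPER_USERNAME", "your-username"),
    os.environ.get("SCRAPER_PASSWORD", "your-password"),
    "https://example.com/protected-content",
)
```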
Handling Cookie Banners and Popups
Remove common obstructions before scraping:
{
"url": "https://example.com",
"renderJs": true,
"javascriptInstruction": [
{
"action": "wait",
"delay": 2000
},
{
"action": "clickElement",
"selector": {
"type": "css",
"value": ".cookie-accept-button"
},
"onError": "continue"
},
{
"action": "clickElement",
"selector": {
"type": "css",
"value": ".popup-close"
},
"onError": "continue"
},
{
"action": "hideElement",
"selector": {
"type": "css",
"value": ".newsletter-overlay"
},
"onError": "continue"
}
]
}
Best Practices
1. Use Appropriate Timeouts
Set reasonable timeout values for actions that might take time:
{
"action": "waitForSelector",
"selector": {
"type": "css",
"value": ".slow-loading-content"
},
"timeout": 15
}
2. Handle Errors Gracefully
Use onError: "continue" for optional actions:
{
"action": "clickElement",
"selector": {
"type": "css",
"value": ".optional-popup-close"
},
"onError": "continue"
}
3. Add Strategic Waits
Allow time for dynamic content to load:
[
{
"action": "clickElement",
"selector": {
"type": "css",
"value": ".load-more"
}
},
{
"action": "wait",
"delay": 3000
}
]
4. Use Specific Selectors
Prefer specific CSS selectors over generic ones:
// ✅ Good
{
"type": "css",
"value": "#main-content .product-title"
}
// ❌ Avoid
{
"type": "css",
"value": "div"
}
5. Minimize JavaScript Execution
Keep custom JavaScript simple and focused:
{
"action": "executeJavaScript",
"source": "document.querySelector('.overlay').remove();"
}
Troubleshooting
Common Issues
- Element Not Found: Ensure selectors are correct and elements exist
- Timing Issues: Add appropriate waits after actions
- Dynamic Content: Use waitForSelector for dynamically loaded elements
- JavaScript Errors: Test custom JavaScript in browser console first
Debugging Tips
- Start with simple actions and build complexity gradually
- Use wait actions to observe page state changes
- Test selectors in browser developer tools
- Use onError: "continue" during development to avoid stopping execution
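Some mistakes, such as a misspelled action name or a missing selector, can be caught locally before a request is made. The following Python sketch checks an instruction list against the actions documented on this page. `check_instructions` is a hypothetical helper, and the set of actions that need a selector is inferred from the examples above rather than from a formal schema.

```python
KNOWN_ACTIONS = {
    "navigateToUrl", "reloadPage", "clickElement", "doubleClickElement",
    "hoverOverElement", "typeText", "performSearch", "selectOption",
    "pressKeyboardKey", "scrollDown", "scrollToElement", "hideElement",
    "wait", "waitForSelector", "waitForNavigation", "executeJavaScript",
    "executeBrowserScript", "setLocation",
}
# Actions whose examples on this page include a selector (an inference).
SELECTOR_ACTIONS = {
    "clickElement", "doubleClickElement", "hoverOverElement", "typeText",
    "performSearch", "selectOption", "scrollToElement", "hideElement",
    "waitForSelector",
}


def check_instructions(instructions):
    """Return a list of problem descriptions; empty if none were found."""
    problems = []
    for index, step in enumerate(instructions):
        action = step.get("action")
        if action not in KNOWN_ACTIONS:
            problems.append(f"step {index}: unknown action {action!r}")
            continue
        if action in SELECTOR_ACTIONS and "selector" not in step:
            problems.append(f"step {index}: {action} has no selector")
        if step.get("onError") not in (None, "continue", "return"):
            problems.append(f"step {index}: invalid onError value")
    return problems


for problem in check_instructions([{"action": "clickElemnt"}]):
    print(problem)
```

A check like this does not replace testing selectors in the browser, but it surfaces typos without spending a request.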