Bright Data

Community, BYOC

Search, Crawl and Scrape any site, at scale, without getting blocked

Author: Arcade

Version: 0.5.0

Auth: No authentication required

3 tools

3 require secrets

Bright Data provides a developer toolkit for large-scale web search, crawling, and scraping, enabling reliable extraction of pages and structured data without getting blocked. It supports search queries, content-to-Markdown conversion, and configurable data feeds across many site types.

It is designed for integration into data pipelines and analytics workflows, with parameterized feeds and output formats.

Capabilities

Crawling and scraping at scale, with anti-blocking behavior for sustained collection.
Search queries against Google, Bing, or Yandex with advanced parameters such as language, country, location, device, and search type.
Transform pages into clean Markdown and emit structured JSON feeds for profiles, products, reviews, listings, and media.
Configurable extraction parameters for batching, pagination, and media handling.

Secrets

The toolkit uses two secrets: an API key (BRIGHTDATA_API_KEY) and a zone token (BRIGHTDATA_ZONE). Brightdata.WebDataFeed requires only the API key. Example values: BRIGHTDATA_API_KEY=sk_..., BRIGHTDATA_ZONE=zone123.
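As a rough illustration of these requirements, the sketch below checks which secrets are still missing before a tool is called. The `REQUIRED_SECRETS` mapping and the `missing_secrets` helper are hypothetical names and not part of the toolkit; the per-tool requirements are copied from the Requirements entries on this page, and reading the secrets from environment variables is an assumption.

```python
import os

# Secrets each tool needs, as listed in the per-tool Requirements on this page.
REQUIRED_SECRETS = {
    "Brightdata.ScrapeAsMarkdown": ("BRIGHTDATA_API_KEY", "BRIGHTDATA_ZONE"),
    "Brightdata.SearchEngine": ("BRIGHTDATA_API_KEY", "BRIGHTDATA_ZONE"),
    "Brightdata.WebDataFeed": ("BRIGHTDATA_API_KEY",),
}


def missing_secrets(tool_name, env=None):
    """Return the names of required secrets that are unset or empty.

    `env` defaults to the process environment (an assumption about where
    the secrets are stored); pass a dict to check another source.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS[tool_name] if not env.get(name)]


if __name__ == "__main__":
    for tool in REQUIRED_SECRETS:
        print(tool, "missing:", missing_secrets(tool) or "nothing")
```

Passing a plain dict as `env` makes the check easy to exercise without touching real credentials.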

Available tools (3)

Tool name	Description	Secrets
Brightdata.ScrapeAsMarkdown	Scrape a webpage and return content in Markdown format using Bright Data.	2
Brightdata.SearchEngine	Search using Google, Bing, or Yandex with advanced parameters using Bright Data.	2
Brightdata.WebDataFeed	Extract structured data from various websites like LinkedIn, Amazon, Instagram, etc. Never make up links: if links are needed, run search_engine first.	1

Brightdata.ScrapeAsMarkdown

Scrape a webpage and return content in Markdown format using Bright Data.

Examples:

scrape_as_markdown("https://example.com") -> "# Example Page Content..."
scrape_as_markdown("https://news.ycombinator.com") -> "# Hacker News ..."

Parameters

Parameter	Type	Req.	Description
`url`	`string`	Required	URL to scrape

Requirements

Secrets: BRIGHTDATA_API_KEY, BRIGHTDATA_ZONE

Output

Type: string. Scraped webpage content as Markdown.
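The single `url` parameter can be checked before the tool is invoked. The following is a minimal sketch, assuming the caller supplies an `execute(tool_name, tool_input)` callable that forwards the request to Arcade; `build_scrape_input`, `scrape_as_markdown`, and `execute` are hypothetical names used only for illustration, and the http(s) check is an assumption rather than a documented rule.

```python
from urllib.parse import urlparse


def build_scrape_input(url):
    """Validate the URL and return the input for Brightdata.ScrapeAsMarkdown."""
    parsed = urlparse(url)
    # Assumption: only absolute http(s) URLs are worth sending to the tool.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {url!r}")
    return {"url": url}


def scrape_as_markdown(execute, url):
    """Run the tool through a caller-supplied execute(tool_name, tool_input)."""
    return execute("Brightdata.ScrapeAsMarkdown", build_scrape_input(url))


if __name__ == "__main__":
    # Stand-in executor so the sketch runs offline; a real one would call Arcade.
    def fake_execute(tool_name, tool_input):
        return f"# Stub output for {tool_input['url']}"

    print(scrape_as_markdown(fake_execute, "https://example.com"))
```

Keeping the transport behind `execute` leaves the choice of client open and lets the input handling be tested without network access.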

Brightdata.SearchEngine

Search using Google, Bing, or Yandex with advanced parameters using Bright Data.

Examples:

search_engine("climate change") -> "# Search Results ## Climate Change - Wikipedia ..."
search_engine("Python tutorials", engine="bing", num_results=5) -> "# Bing Results ..."
search_engine("cats", search_type="images", country_code="us") -> "# Image Results ..."

Parameters

Parameter	Type	Req.	Description
`query`	`string`	Required	Search query
`engine`	`string`	Optional	Search engine to use: `google`, `bing`, `yandex`
`language`	`string`	Optional	Two-letter language code
`country_code`	`string`	Optional	Two-letter country code
`search_type`	`string`	Optional	Type of search: `images`, `shopping`, `news`, `jobs`
`start`	`integer`	Optional	Results pagination offset
`num_results`	`integer`	Optional	Number of results to return. The default is 10
`location`	`string`	Optional	Location for search results
`device`	`string`	Optional	Device type: `mobile`, `ios`, `iphone`, `ipad`, `android`, `android_tablet`
`return_json`	`boolean`	Optional	Return JSON instead of Markdown

Requirements

Secrets: BRIGHTDATA_API_KEY, BRIGHTDATA_ZONE

Output

Type: string. Search results as Markdown or JSON.
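Because several parameters accept only a fixed set of values, it can help to validate the input before calling the tool. The sketch below builds a Brightdata.SearchEngine input and generates paged inputs. `build_search_input` and `paged_inputs` are hypothetical helpers; the allowed values are read from the parameter table above, and treating `start` as a zero-based result offset is an assumption.

```python
# Allowed values as read from the parameter table on this page.
ALLOWED_VALUES = {
    "engine": {"google", "bing", "yandex"},
    "search_type": {"images", "shopping", "news", "jobs"},
    "device": {"mobile", "ios", "iphone", "ipad", "android", "android_tablet"},
}
# Parameters without an enumerated value list.
FREE_FORM = {"language", "country_code", "start", "num_results", "location", "return_json"}


def build_search_input(query, **options):
    """Return a Brightdata.SearchEngine input dict, rejecting unknown values."""
    payload = {"query": query}
    for name, value in options.items():
        if name not in ALLOWED_VALUES and name not in FREE_FORM:
            raise ValueError(f"unknown parameter: {name}")
        if value is None:
            continue  # leave unset options out so the tool applies its defaults
        if name in ALLOWED_VALUES and value not in ALLOWED_VALUES[name]:
            raise ValueError(f"{name} must be one of {sorted(ALLOWED_VALUES[name])}")
        payload[name] = value
    return payload


def paged_inputs(query, pages, page_size=10, **options):
    """Yield one input per page, advancing `start` by `page_size` each time.

    Assumes `start` is a zero-based offset into the result list.
    """
    for page in range(pages):
        yield build_search_input(
            query, start=page * page_size, num_results=page_size, **options
        )


if __name__ == "__main__":
    for payload in paged_inputs("Python tutorials", pages=2, page_size=5, engine="bing"):
        print(payload)
```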

Brightdata.WebDataFeed

Extract structured data from various websites like LinkedIn, Amazon, Instagram, etc. Never make up links: if links are needed, run search_engine first.

Supported source types:

- amazon_product, amazon_product_reviews
- linkedin_person_profile, linkedin_company_profile
- zoominfo_company_profile
- instagram_profiles, instagram_posts, instagram_reels, instagram_comments
- facebook_posts, facebook_marketplace_listings, facebook_company_reviews
- x_posts
- zillow_properties_listing
- booking_hotel_listings
- youtube_videos

Examples:

web_data_feed("amazon_product", "https://amazon.com/dp/B08N5WRWNW") -> "{"title": "Product Name", ...}"
web_data_feed("linkedin_person_profile", "https://linkedin.com/in/johndoe") -> "{"name": "John Doe", ...}"
web_data_feed("facebook_company_reviews", "https://facebook.com/company", num_of_reviews=50) -> "[{"review": "...", ...}]"

Parameters

Parameter	Type	Req.	Description
`source_type`	`string`	Required	Type of data source: `amazon_product`, `amazon_product_reviews`, `linkedin_person_profile`, `linkedin_company_profile`, `zoominfo_company_profile`, `instagram_profiles`, `instagram_posts`, `instagram_reels`, `instagram_comments`, `facebook_posts`, `facebook_marketplace_listings`, `facebook_company_reviews`, `x_posts`, `zillow_properties_listing`, `booking_hotel_listings`, `youtube_videos`
`url`	`string`	Required	URL of the web resource to extract data from
`num_of_reviews`	`integer`	Optional	Number of reviews to retrieve. Only applicable for facebook_company_reviews. Default is None
`timeout`	`integer`	Optional	Maximum time in seconds to wait for data retrieval
`polling_interval`	`integer`	Optional	Time in seconds between polling attempts

Requirements

Secrets: BRIGHTDATA_API_KEY

Output

Type: string. Structured data from the requested source as JSON.
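The parameter rules above, in particular that `num_of_reviews` only applies to `facebook_company_reviews`, can be enforced before the call, and the JSON string output can be parsed afterwards. This is a minimal sketch under those assumptions; `build_feed_input` and `parse_feed_output` are hypothetical helper names, and normalizing the output to a list is a design choice of the sketch, not documented behavior.

```python
import json

# Source types listed in the tool description on this page.
SOURCE_TYPES = {
    "amazon_product", "amazon_product_reviews",
    "linkedin_person_profile", "linkedin_company_profile",
    "zoominfo_company_profile",
    "instagram_profiles", "instagram_posts", "instagram_reels", "instagram_comments",
    "facebook_posts", "facebook_marketplace_listings", "facebook_company_reviews",
    "x_posts", "zillow_properties_listing", "booking_hotel_listings", "youtube_videos",
}


def build_feed_input(source_type, url, num_of_reviews=None, timeout=None,
                     polling_interval=None):
    """Return a Brightdata.WebDataFeed input dict after basic validation."""
    if source_type not in SOURCE_TYPES:
        raise ValueError(f"unsupported source_type: {source_type}")
    if num_of_reviews is not None and source_type != "facebook_company_reviews":
        raise ValueError("num_of_reviews only applies to facebook_company_reviews")
    payload = {"source_type": source_type, "url": url}
    optional = {
        "num_of_reviews": num_of_reviews,
        "timeout": timeout,
        "polling_interval": polling_interval,
    }
    # Omit unset options so the tool falls back to its own defaults.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def parse_feed_output(raw):
    """Parse the tool's JSON string and always return a list of records."""
    data = json.loads(raw)
    return data if isinstance(data, list) else [data]


if __name__ == "__main__":
    print(build_feed_input("facebook_company_reviews",
                           "https://facebook.com/company", num_of_reviews=50))
    print(parse_feed_output('{"title": "Product Name"}'))
```

Returning a list in every case means downstream code can iterate over records whether the feed produced a single object or an array.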

Get Building

Use tools hosted on Arcade Cloud

Arcade tools are hosted by our cloud platform and ready to be used in your agents.

Self Host Arcade tools

Arcade tools can be self-hosted on your own infrastructure. Learn more about self-hosting.

pip install arcade_brightdata
