26
Feb

Scraping Yelp verified Attractions Data

Posted By admin
data mining

Project Title: Scraping Yelp verified Attractions Data

Project Description:
Objective: Retrieve all the basic listing data, verified from Yelp, for a subset of categories, for each city in each US state.

Steps:
1. Go to http://www.yelp.com/locations/states

2. Click into a state (e.g. California)
a. This brings you to a page with a list of all cities in the state.
b. We would like to crawl each city listed

3. Click into a city (e.g. Danville)
a. Once on this page, there is a heading that says “Best of Yelp”

4. Click on the button on the right side labeled “See More”
a. This brings you to a page with a section with a rectangular block of pictures.

5. At the bottom right of this block of pictures, click a button labeled “More Restaurants”
a. This page should have a list of categories and filters, followed by a listing of businesses
b. On the top left there should be a breadcrumb feature that says “Businesses > Restaurants”

6. Click on the portion of the breadcrumb that says “Businesses”
a. The page should then refresh and the breadcrumb should disappear
b. Underneath the city name, there should be a list of categories
c. We would like to crawl only the categories “Active Life”, “Arts & Entertainment”, and “Hotels & Travel”

7. Click on one of the categories (only “Active Life”, “Arts & Entertainment”, or “Hotels & Travel”)
a. The page should refresh again and the breadcrumb should reappear
b. Underneath this section, there are filters under the labels of “Sort By”, “Cities”, “Distance”, and “Features”
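
The category restriction in steps 6 and 7 can be sketched as a small whitelist filter. This is a minimal sketch, assuming the category labels have already been read off the page as plain strings; the function names are hypothetical. Labels are normalized before comparison so that spacing variants such as “Hotels &Travel” still match.

```python
import re

# The three categories named in step 6c, in canonical spelling.
TARGET_CATEGORIES = ("Active Life", "Arts & Entertainment", "Hotels & Travel")


def normalize_category(label: str) -> str:
    """Collapse whitespace (including around '&') and ignore case."""
    label = re.sub(r"\s*&\s*", " & ", label)
    return re.sub(r"\s+", " ", label).strip().casefold()


# Normalized forms of the targets, computed once.
_WANTED = {normalize_category(c) for c in TARGET_CATEGORIES}


def select_categories(labels):
    """Return only the labels that match a target category, in page order."""
    return [label for label in labels if normalize_category(label) in _WANTED]
```

Calling `select_categories` on the full category list from a city page leaves only the categories that should be crawled; everything else, such as “Restaurants”, is skipped.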

8. Under the “Distance” label, click “Bird’s-eye View”. This shows all listings in that city; otherwise the page may default to a 2-mile radius or some other setting that shows only a subset of the city's listings.
a. The page will refresh again with new listings
b. We want to crawl every listing on this page
i. Sometimes there will be listings located outside the city originally clicked on in Step 3. Compare the city of each attraction on the list with the name of the city being crawled, and exclude anything that is not in that city. Please do not include listings outside the original location, since they will be duplicates
c. Note: The listings are paginated
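
The pagination and city check in step 8 could look roughly like the sketch below. It is not tied to any real Yelp endpoint: `fetch_page` is a hypothetical callable, supplied by the crawler, that returns the listings on a given result page as dictionaries with `"name"` and `"city"` keys and returns an empty list once the pages run out. Those key names and the `max_pages` safety cap are assumptions made for illustration.

```python
from typing import Callable, Iterator, Optional


def same_city(listing_city: Optional[str], target_city: str) -> bool:
    """Compare city names ignoring case and extra spaces; a missing city never matches."""
    if not listing_city:
        return False
    clean = lambda s: " ".join(s.split()).casefold()
    return clean(listing_city) == clean(target_city)


def crawl_city_listings(
    fetch_page: Callable[[int], list],
    target_city: str,
    max_pages: int = 1000,  # assumed safety cap, not from the brief
) -> Iterator[dict]:
    """Walk result pages in order and yield only listings located in target_city."""
    for page_number in range(1, max_pages + 1):
        listings = fetch_page(page_number)
        if not listings:
            break  # past the last page
        for listing in listings:
            if same_city(listing.get("city"), target_city):
                yield listing
```

Dropping out-of-city listings at this stage means each attraction is collected only when its own city is crawled, which avoids the duplicates mentioned above.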

9. Click into the name of a listing
a. This is where the data will be pulled. On the next page, you will find a list of all the data that we want plus an annotated screenshot for guidance
b. Note: For many attractions, not all of the data will be available. Some attractions display nothing but their name
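
Because many listings are incomplete, the output record should tolerate missing values. The sketch below shows one way to do that, assuming CSV output. Apart from the name, the field names (`city`, `address`, `phone`) are hypothetical placeholders, since the actual field list is on a page not reproduced here.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class Attraction:
    name: str  # the one field the brief implies is always shown
    # Hypothetical optional fields; replace with the real field list.
    city: Optional[str] = None
    address: Optional[str] = None
    phone: Optional[str] = None


def to_csv(records) -> str:
    """Serialize records to CSV text, writing missing values as empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(Attraction)],
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        row = {key: "" if value is None else value for key, value in asdict(record).items()}
        writer.writerow(row)
    return buffer.getvalue()
```

A listing that shows only its name still produces a full-width row, so every row in the file has the same columns.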

For similar work requirements, feel free to email us at info@webscrapingexpert.com.

Comments
  • 2 years ago Joyce Carroll

    Help me build a database of a few business categories from the directory website http://www.yellowpages.com. More information will be given once we receive an initial response from your end.

  • 2 years ago Jane Kelley

    Our requirement is to scrape business listings for Spain from Foursquare.com. Please let me know your solution.

    Awaiting your reply!!

  • 2 years ago Charles Wright

    I am looking for someone who can build a list of businesses from http://www.dexknows.com.

    Have you worked on this kind of project in the past?

  • 2 years ago Philip Davis

    Scrape restaurant information from zomato.com, with menus and prices. Have you worked on this kind of project in the past?

  • 2 years ago Ken Hull

    Scraping all the Plumbers and Electricians data from Yelp. Is this something you can do for us?

  • 1 year ago Chris Randall

    I would like to extract “Things to do in Amsterdam” from tripadvisor.com. Browse through all categories and get all the data fields available.
