Scraping Lawyers and Accountants Database
Lawyers: All possible details of the office and the lawyer need to be collected.
The site is a complete paginated list of offices; from each office you can click through to the lawyers in that firm.
From this you can create two lists; one for the offices and one for the lawyers.
The URL for each office ends with a unique ID. Save this ID with each office and store it in each lawyer record to link the two.
Each lawyer has specialised areas of law (called ‘rechtsgebieden’ in Dutch). These areas are very important.
There is also a unique ID in the URL per lawyer. This should also be saved.
There should be around 18236 lawyers and 5698 offices.
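The ID extraction and the office-to-lawyer link described above could be sketched as follows. This is only an illustration: the URLs, the field names and the helper `trailing_id` are hypothetical, and it assumes the ID is the last path segment of the URL, which should be checked against the real site.

```python
from urllib.parse import urlparse


def trailing_id(url: str) -> str:
    """Return the last non-empty path segment of a URL (assumed to be the unique ID)."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments:
        raise ValueError(f"no path segment in {url!r}")
    return segments[-1]


# Hypothetical URLs, for illustration only.
office_url = "https://example.org/offices/some-firm/12345"
lawyer_url = "https://example.org/lawyers/jane-doe/67890/"

# One record for the offices list.
office = {"office_id": trailing_id(office_url), "url": office_url}

# One record for the lawyers list; office_id links it back to the office.
lawyer = {
    "lawyer_id": trailing_id(lawyer_url),
    "office_id": office["office_id"],
    "rechtsgebieden": [],  # specialised areas, filled in by the scraper
    "url": lawyer_url,
}
```

Keeping `office_id` on every lawyer record means the two lists can be joined later without relying on firm names.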
Accountants (has two sites):
Again, all possible details from both sites need to be collected.
Offices – https://www.nba.nl/kantorenzoeker/
This one is very easy: the data for all 2969 offices is already on the first page as JSON. Check the “Offices” request for this.
You can search this site by two-letter combinations (AA, AB, … ZZ).
There is no pagination, but you can load more results on the same page by clicking “toon meer” (“show more”).
It’s pretty slow.
Each record has a unique ID in the URL of the link labelled “Leden kunnen inloggen voor meer informatie” (“Members can log in for more information”). We want that ID as well.
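The two-letter search described above could be driven by a generated list of terms. The sketch below only builds the terms and merges result batches. The record shape and the `id` key are assumptions, and the merge step assumes the same record can show up under more than one search term.

```python
from itertools import product
from string import ascii_uppercase


def two_letter_terms() -> list[str]:
    """Return every two-letter search term from AA to ZZ, in alphabetical order."""
    return ["".join(pair) for pair in product(ascii_uppercase, repeat=2)]


def merge_results(batches):
    """Merge result batches, keeping the first record seen for each unique ID."""
    seen = {}
    for batch in batches:
        for record in batch:
            # "id" is a hypothetical key holding the unique ID from the URL.
            seen.setdefault(record["id"], record)
    return list(seen.values())


terms = two_letter_terms()
```

Since the site is slow, it may be worth saving each term's results to disk as they arrive so an interrupted run can resume.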
We would like to have the offices and accountants linked. This should be possible by comparing the “Kantooradres” (office address) field with the street information in the JSON and creating a unique ID.
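One possible way to build such a linking ID is to normalise both address strings and hash the result. This is a sketch under assumptions, not a tested matching rule: the helper names and the sample address are made up, and real addresses may need extra handling (abbreviations, suite numbers, P.O. boxes).

```python
import hashlib
import re
import unicodedata


def normalize_address(address: str) -> str:
    """Lowercase, strip accents, and collapse punctuation and whitespace to single spaces."""
    text = unicodedata.normalize("NFKD", address)
    text = "".join(c for c in text if not unicodedata.combining(c))
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def address_key(address: str) -> str:
    """Return a short, stable ID derived from the normalised address."""
    digest = hashlib.sha1(normalize_address(address).encode("utf-8"))
    return digest.hexdigest()[:12]


# Made-up address written two ways, as it might appear on the two sites.
from_accountant_page = "Voorbeeldstraat 12, 1234 AB Amsterdam"
from_offices_json = "voorbeeldstraat 12  1234 AB  AMSTERDAM"

same_office = address_key(from_accountant_page) == address_key(from_offices_json)
```

Records from both sites that share the same `address_key` can then be treated as belonging to the same office; near-matches that differ only slightly would still need a manual or fuzzy check.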
For similar work requirements, feel free to email us at firstname.lastname@example.org.