IAWM Extractor is a tool for mining data from the Internet Archive Wayback Machine. Recovering old website URLs can be very important if you forgot to redirect URLs from your previous website to the new one. That is a problem if you don’t have your own URL archive. Fortunately, the Wayback Machine has the largest publicly accessible archive of URLs, and with IAWM Extractor you can get this historical data quickly and easily and use it for any purpose.
The Internet Archive Wayback Machine (IAWM) is an excellent tool with an extremely helpful database. Unfortunately, it is a bit hard to work with it, or with its API, if you don’t have any coding background. Therefore I decided to create a free tool that puts the IAWM API to good use and offers the basic data through a simple interface.
Enter the domain for which you want historical URLs into the text field. The tool will return up to 10,000 URLs. The number of extracted URLs is shown under “Number of returned URLs“. The URLs themselves appear in the text field below. To copy everything to the clipboard, use the “Copy URLs“ button.
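For readers who do have a coding background, the request behind such a lookup might look roughly like the sketch below. It assumes the public Wayback Machine CDX endpoint and its `url`, `matchType`, `fl`, `collapse` and `limit` parameters. The function names `build_cdx_query` and `fetch_urls` are hypothetical, and this is not necessarily how IAWM Extractor itself talks to the API.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public CDX endpoint of the Wayback Machine (assumed here; the tool's
# real backend may use a different route or a proxy).
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain: str, limit: int = 10_000) -> str:
    """Build a CDX query URL asking for archived URLs of a domain."""
    params = {
        "url": domain,
        "matchType": "domain",   # include subdomains of the given domain
        "fl": "original",        # return only the originally archived URL
        "collapse": "urlkey",    # collapse repeated captures of the same URL
        "limit": limit,          # cap the number of returned rows
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def fetch_urls(domain: str, limit: int = 10_000, timeout: float = 60.0) -> list[str]:
    """Download the result list, one URL per line (needs network access)."""
    with urlopen(build_cdx_query(domain, limit), timeout=timeout) as response:
        text = response.read().decode("utf-8", errors="replace")
    return [line for line in text.splitlines() if line]
```

Calling `fetch_urls("example.com")` would then return a plain list of URLs, similar to what the tool shows in its output field.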
The tool will be expanded with new functions and settings. Right now it automatically deduplicates and cleans messy URLs to provide a tidier, ready-to-use dataset.
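As an illustration of what deduplicating and cleaning can mean in practice, here is a minimal sketch. The exact rules the tool applies are not documented here, so the ones below (trim whitespace, lowercase the host, drop the fragment, skip empty lines) are assumptions, and `clean_url` and `dedupe_urls` are hypothetical names.

```python
from urllib.parse import urlsplit, urlunsplit


def clean_url(raw: str) -> str:
    """Normalize one URL: trim whitespace, lowercase the host, drop the fragment."""
    parts = urlsplit(raw.strip())
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, parts.query, ""))


def dedupe_urls(urls: list[str]) -> list[str]:
    """Clean every URL and keep only the first occurrence, preserving order."""
    seen: set[str] = set()
    result: list[str] = []
    for raw in urls:
        if not raw.strip():
            continue  # skip empty lines
        cleaned = clean_url(raw)
        if cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result
```

Keeping the first occurrence and the original order makes the output easy to compare against the raw list.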
For each domain, you can get at most 10,000 URLs, and only the basic dataset without any data expansions.
Right now the tool is in limited beta. In the future, we will provide more URLs and data.
If you need more data without waiting, please use the contact form at the end of the FAQ. We will try to help you.
This shouldn’t happen. The tool automatically removes this from the final dataset. If you come across this issue, please report a bug using the contact form at the end of this page.
What is :80 in a URL? It is the default port for standard HTTP connections. The string is added to the URL during the archival process, so it is not a mistake. But we understand it might be confusing, and it is not good to have such redundant strings in your data. Therefore IAWM Extractor removes it for you to make your further work smoother.
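If you work with raw Wayback Machine data yourself, removing the redundant port could be sketched like this. The helper name `strip_default_port` is hypothetical, and the sketch only handles port 80 on plain `http` URLs. The tool's own cleanup may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_default_port(url: str) -> str:
    """Remove an explicit ':80' from an http URL; leave everything else alone."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.netloc.endswith(":80"):
        netloc = parts.netloc[: -len(":80")]
        return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    return url


# An archived URL with the redundant default port.
print(strip_default_port("http://example.com:80/page?x=1"))
```

Other ports, such as `:8080`, are left untouched because they are not the HTTP default and do change where the URL points.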
You can read more about HTTP and port 80 on Wikipedia.
Sorry for any inconvenience. Most of the data should be returned in a few seconds. Sometimes a request might take a bit longer, especially if there is a lot of data to process. But even then, it should take no more than 30-60 seconds.
If it takes a very long time or doesn’t work at all, there is probably a third-party issue. This tool relies on the IAWM API and a proxy server, neither of which we can influence.
In the future we will show debug info indicating whether the API and proxies are online and working correctly. For now, please wait until it works again. If the problem lasts more than 24 hours, please use the contact form below to report a serious bug.