Function Breakdown
scrapeEventGivenCity
This function uses the puppeteer-cluster package to scrape event data from Eventbrite. We use https://www.eventbrite.ca/d/${city}/all-events as the source. The overall flow is as follows:
- Scrape raw data from Eventbrite using Puppeteer (we scrape 3 pages, each containing 20 events)
- Clean up the raw data and add it to an array of event objects
- Add the events to Firestore
- Update the timestamp recording when the city's events were last updated
- Return the array of event objects as the response
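The URL-building and cleanup steps above can be sketched as follows. This is a minimal illustration, not the actual implementation: the `RawEvent` and `CleanEvent` field names, the helper names, and the `?page=` query parameter are all assumptions; only the base URL and the page count of 3 come from the description above.

```typescript
// Hypothetical shape of one scraped record; the real field names are not given here.
interface RawEvent {
  title: string;
  date: string;
  url: string;
}

// Hypothetical shape of a cleaned event object.
interface CleanEvent {
  title: string;
  date: string;
  url: string;
  city: string;
}

const PAGES_TO_SCRAPE = 3; // from the flow above: 3 pages of 20 events each

// Builds one URL per result page. The base URL comes from the description;
// the `?page=` query parameter is an assumption about how pages are addressed.
function buildPageUrls(city: string, pages: number = PAGES_TO_SCRAPE): string[] {
  const base = `https://www.eventbrite.ca/d/${city}/all-events`;
  const urls: string[] = [];
  for (let page = 1; page <= pages; page++) {
    urls.push(`${base}/?page=${page}`);
  }
  return urls;
}

// Trims whitespace, drops records with no title or URL, and removes duplicates by URL.
function cleanEvents(raw: RawEvent[], city: string): CleanEvent[] {
  const seen: { [url: string]: boolean } = {};
  const cleaned: CleanEvent[] = [];
  for (const event of raw) {
    const title = event.title.trim();
    const url = event.url.trim();
    if (title === "" || url === "" || seen[url]) {
      continue;
    }
    seen[url] = true;
    cleaned.push({ title, date: event.date.trim(), url, city });
  }
  return cleaned;
}
```

Keeping the cleanup as a pure function separate from the browser automation makes it possible to test it without launching Puppeteer.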
deleteOutdatedUserEvents
This function deletes all user events whose scheduled time passed more than 2 hours ago. This prevents users from seeing events that have already taken place. The overall flow is as follows:
- Get all user events from Firestore
- Check if the current time is greater than the event's scheduled time + 2 hours
- Delete all outdated events
- Return an array of deleted events as a response
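The time check in the steps above can be sketched as a small pure function. This is only an illustration under stated assumptions: the `UserEvent` shape, the `scheduledAt` field (taken here to be milliseconds since the Unix epoch), and the helper names are hypothetical, and the Firestore read and delete calls are left out.

```typescript
// Hypothetical shape of a stored user event; the real schema is not given here.
interface UserEvent {
  id: string;
  scheduledAt: number; // assumed to be milliseconds since the Unix epoch
}

const GRACE_PERIOD_MS = 2 * 60 * 60 * 1000; // the 2-hour window from the description

// True when the current time is later than the scheduled time plus 2 hours.
function isOutdated(event: UserEvent, nowMs: number): boolean {
  return nowMs > event.scheduledAt + GRACE_PERIOD_MS;
}

// Returns the events that should be deleted; the input array is left unchanged.
function findOutdatedEvents(events: UserEvent[], nowMs: number): UserEvent[] {
  return events.filter((event) => isOutdated(event, nowMs));
}
```

Passing the current time in as a parameter, rather than calling `Date.now()` inside the helper, keeps the check deterministic and easy to test. Note that an event exactly 2 hours past its scheduled time is kept, because the comparison is strictly greater than.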
periodicScraper
This function is meant to refresh the scraped events for the 10 cities whose events were updated least recently, so that outdated scraped events are replaced. The flow of the function is as follows:
- Get the last-updated timestamp for every city, as recorded by the scrapeEventGivenCity function when it scraped that city.
- Sort the timestamps from oldest to newest.
- Scrape the events for the 10 most outdated cities, based on the sorted timestamps.
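The sort-and-select steps above can be sketched as follows. This is a minimal illustration, assuming each city's timestamp is available as a number (milliseconds since the Unix epoch); the `CityTimestamp` shape and the `pickMostOutdatedCities` helper are hypothetical names, and the Firestore read and the scrape itself are omitted.

```typescript
// Hypothetical shape of a per-city timestamp record.
interface CityTimestamp {
  city: string;
  lastUpdated: number; // assumed to be milliseconds since the Unix epoch
}

const CITIES_PER_RUN = 10; // the "top 10" from the description

// Sorts a copy of the records from oldest to newest and returns the names
// of the first `limit` cities, which are the ones most in need of a re-scrape.
function pickMostOutdatedCities(
  timestamps: CityTimestamp[],
  limit: number = CITIES_PER_RUN
): string[] {
  return timestamps
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.lastUpdated - b.lastUpdated)
    .slice(0, limit)
    .map((entry) => entry.city);
}
```

Each returned city name could then be passed to the same scraping logic that scrapeEventGivenCity uses, which would also update that city's timestamp and move it to the back of the queue for the next run.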