Update on the UBL feed
Aug 19, 2014 7:00am
The APWG’s URL Block List (UBL) began life as a redundant secure FTP service (the ancient and venerable APWG PhishFeed), inspired by industry demand in late 2003 for a clearance mechanism for phishing report exchange and routing. As the user base matured, demand increased for more flexible interface schemes, spurring APWG Engineering to develop a number of interactive services for the UBL under HTTPS on the APWG eCrime Exchange (eCX) found at https://www.ecrimex.net.
This APWG Members Data Advisory covers the various ways to acquire UBL data. First, a quick overview of the eCX to frame the larger discussion.
eCX’s prime directive is to induct threat data from external sources, normalize them into known formats and feeds, and send the aggregated data back out to APWG members via pulls and pushes, just as a network switch routes traffic, based on internal configurations and rules, to prescribed endpoints. eCX, as a switch, works to intelligently pipe cybercrime machine event data to APWG members according to the user interest and user authority within an application milieu. (If you’ve an interest in eCX workgroups functionality, ask us for the eCX User Guide.)
APWG members sign up to be registered on the eCX (more on that in the 'Getting Started' section below) and then apply for access to the various programs within eCX to reach specific data, such as the URL Block List. The UBL is the initial program within eCX and the focus of today's email.
Currently eCX supports 4 different access points into the UBL data:
1) External access to the UBL download.
This is the replicated PhishFeed FTP pull of the phishing report data. eCX provides cURL and wget scripts that allow APWG members to access the data programmatically via HTTPS. Once admitted as a UBL user, users can automate scripts to grab data at whatever intervals are required. There is no throttling. The data is prepackaged in several files by age, in CSV and XML layouts, with the files compressed via gzip. eCX provides a rolling "last 72 hours" file that is updated every 5 minutes, along with files covering the last week, month, 6 months, and year. A UBL User Guide is available upon request at user enrollment.
- Documentation for external download script: https://www.ecrimex.net/manuals/ECX_Scripted_Downloads_for_URL_Block_List_Instructions.pdf
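For members who would rather script the pull in Python than use the supplied cURL and wget scripts, the download-and-unpack step might look roughly like the sketch below. It is not the official script: the URL is a hypothetical placeholder (the real path and authentication are in the instructions linked above), and it assumes the CSV file carries a header row.

```python
import csv
import gzip
import io
import urllib.request

# Hypothetical placeholder, NOT a real eCX endpoint. The actual download URL
# and credentials come from the scripted-download instructions.
FEED_URL = "https://example.invalid/ubl/last72hours.csv.gz"


def fetch_feed(url: str = FEED_URL, timeout: float = 60.0) -> bytes:
    """Download a gzip-compressed feed file over HTTPS and return raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def parse_gzipped_csv(payload: bytes) -> list[dict[str, str]]:
    """Decompress a gzip payload and parse it as CSV.

    Assumes the first row is a header; the column names are whatever the
    file supplies, nothing is hard-coded here.
    """
    text = gzip.decompress(payload).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))
```

Since the 72-hour file is refreshed every 5 minutes, a scheduler such as cron could call `parse_gzipped_csv(fetch_feed())` at that interval or any slower one.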
2) Internal eCX download
Within the eCX, the UBL web UI provides access to the exact same CSV and XML files, in the same durations, that are available in the External download. Once an enrolled user logs into the UBL application at https://www.ecrimex.net they'll see a "Download" area, which displays a tabbed menu of all the various durations and the CSV/XML output formats.
- To access the UBL downloads from within eCX, use the menu system at the top of the page by clicking eCX Applications->Block List and then clicking the Download tab. Links to the Direct Downloads, instructions for external/remote downloads with examples, and a link to the documentation for the UBL API (see below) are all available on the Download tab.
3) Filtered UI View Data
The UBL program within eCX also provides a web interface for users to view and filter data in a smart "datatable" layout. Users can select the number of records they'd like to see at one time and filter search results by any combination of URL string, brand, time stamp, and confidence factor. Once a filtered set is isolated, the user can export and download the data in a variety of formats. Within the eCX framework, users can also "tag" the records in the results, create a topic-specific workgroup, and invite others to join and collaborate. This view of the data is also throttle free: any matching phish within the 1.5 million eCX UBL records is available. The Filtered UI View also differs a bit from methods 1 and 2 above. During the first 4 hours after we first receive a phish URL, we ping the domain every 5 minutes, noting the initial IP and recording any changes to it. Because this "first 4 hours" IP data is available in the Filtered UI View, you can also filter by IP, including by partial IP to look for commonality.
- Documentation is available at: https://www.ecrimex.net/manuals/ECX_UBL_and_Whitelist_Applications_User_Guide.pd , pages 4 to 6
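To show what partial-IP filtering can surface, here is a small offline sketch that applies the same idea to records you have already exported. It is an illustration only, not how the eCX UI implements the filter: the record layout and the `ip` field name are hypothetical, and it assumes "partial" means leading octets of an IPv4 address.

```python
def matches_partial_ip(ip: str, partial: str) -> bool:
    """Return True when the leading octets of `ip` equal those in `partial`.

    Comparing whole octets keeps "192.0.2" from matching "192.0.20.7",
    which a plain string prefix test would allow. An empty `partial`
    matches every address.
    """
    wanted = [part for part in partial.split(".") if part]
    return ip.split(".")[: len(wanted)] == wanted


def filter_records(records, partial, ip_field="ip"):
    """Keep the records whose IP field matches the partial address.

    `ip_field` is a hypothetical column name; use whatever your export has.
    """
    return [r for r in records if matches_partial_ip(r.get(ip_field, ""), partial)]
```

For example, `filter_records(rows, "192.0.2")` would group phish hosted in the same /24, which is one way to spot the commonality mentioned above.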
4) The new UBL API
APWG Engineering is moving eCX towards an API-centric model, making it easier to submit and consume data via common REST methods in order to get quick access to the data. In this first non-REST release of our UBL API, users may consume the data that they are interested in via a robust set of HTTP parameters. This is the most flexible of the interfaces, allowing access to things like confidence ranges, wildcard searching, control over start and end dates so users can look back for data as far as late 2009, and support for "first 4 hour" IP ranges. Additionally, eCX users can choose to download data in JSON or XML format (no CSV option currently exists), but results are limited to the first 2,000 matching records. If you find that you need more than 2,000 records, please contact APWG Engineering to customize access for the extraordinary scenario. There is a UBL API workgroup within eCX, which users can apply to join, that contains the API documentation. Going forward, the API style of interaction will be the primary focus for all areas within eCX, both for submitting and consuming all forms of cybercrime event data within the eCX "switch" architecture.
- Documentation is available at: https://www.ecrimex.net/manuals/ECX_URL_Block_List_API.pdf
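As a rough idea of what a parameter-driven query could look like from a client, the sketch below assembles a query URL and flags results that may have hit the 2,000-record limit. The endpoint and every parameter name shown are hypothetical stand-ins; the real ones are defined in the API documentation linked above.

```python
from urllib.parse import urlencode

# Hypothetical placeholder, NOT the real UBL API endpoint.
API_BASE = "https://example.invalid/ubl/api"

# Per the advisory, results are limited to the first 2,000 matching records.
MAX_RECORDS = 2000


def build_query_url(base: str = API_BASE, **filters) -> str:
    """Build a query URL from keyword filters, skipping any set to None.

    The filter names are supplied by the caller and are illustrative only.
    Keys are sorted so the resulting URL is deterministic.
    """
    params = {key: value for key, value in sorted(filters.items()) if value is not None}
    return f"{base}?{urlencode(params)}"


def may_be_truncated(records) -> bool:
    """Return True when a result set is large enough to have hit the cap."""
    return len(records) >= MAX_RECORDS
```

A call such as `build_query_url(format="json", confidence_min=90, start_date="2014-08-01")` yields one URL per query. If `may_be_truncated` returns True, narrowing the date range is a simple way to bring each request back under the limit.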
If you are not currently a registered eCX user, request access by sending a note to APWG Engineering at firstname.lastname@example.org. An invite will be issued via email with a link to the eCX registration page. Sign-up is a simple one-minute process. "Test drive" sponsorship levels are available to evaluate the eCX and the UBL's new services. Once enrolled as a registered user within eCX, you'll need to apply for access to the UBL program. Use the menu system at the top of the page by clicking eCX Applications->Block List; on the landing page you'll see the link to apply for access. Once approved, you'll be able to use any or all of the methods above to access the UBL data.
As always, if you have any questions or input, please feel free to get in touch with me or any of the APWG managers. We'll be happy to answer questions and provide online demos or screencasts to help get you going.
In the next installment I'll discuss some of the upcoming changes to the UBL feed and what we envision for the future of the eCX UBL, along with a quick backgrounder on the work APWG Engineering has completed to bring some of the archived UBL data back online for member research and applications.