Mining for OSINT gold: RIR data via API

This is a guest post by David Mashburn (@d_mashburn), Certified SANS Instructor and cyber ninja!

OSINT isn’t just about doing pre-attack recon. It is often leveraged by defenders as part of the incident response and investigation process. One of the most common applications of OSINT for a defender is to perform lookups on available information for IP addresses. This type of lookup can be performed via any of several different web pages, but for efficiency and integration into workflow, automation is essential. That means scripting, and often those scripts will leverage an API.

The first stop for IP information is typically one of the Regional Internet Registries (RIRs). Being based in the United States, I usually start with ARIN. However, as I have been working on automation for IP lookups, I found that handling the referral to other RIRs for non-ARIN IP blocks was hampered by the lack of a RESTful API at some of the registries. I also noticed that the data returned could vary significantly, while my needs were really focused on a subset of that data.

ARIN offers a robust API, but for IP addresses that are not allocated to ARIN, the API returns a referral to the appropriate RIR in the response data. This makes sense, but it makes automation slightly more complex, as every response must be checked for this referral. Ideally we would like to get the data immediately rather than having to recurse to the answer. RIPE NCC, the RIR for Europe, also offers an API for searching whois data. This still doesn’t solve the issue of one-stop shopping, but RIPE has another API called RIPEstat. This API offers a different look at the data, and has two invaluable features: a lookup for abuse contact info, and a whois query that “returns whois information from the relevant Regional Internet Registry and Routing Registry.” This sounds like exactly what we would like to have – a single call to a source to get data on any IP address.

Let’s look at some examples of these API calls in action using the command line. Using cURL to make the REST API call and jq to parse out key data from the JSON response, we can get a quick set of useful data on an IP address.

curl --silent --header "Accept: application/json"

That returns the set of available data from ARIN for that IP address in JSON. Let’s pipe that output into jq, and pull out the key fields of interest. In this case, we want to get data regarding the IP owner. Sometimes you may get that in the form of customer data, and other times it may be organizational data. Let’s focus on pulling out a few data elements with jq, as shown below, such as the organization name, start address, and end address.

 curl --silent --header "Accept: application/json" | jq '.net.orgRef["@name"],.net.netBlocks.netBlock.startAddress["$"],.net.netBlocks.netBlock.endAddress["$"]'
"Microsoft Corporation"

As you see, you can reduce the full output to a concise view of specific data, and that helps immensely when doing triage. Even better would be to ingest this data and use it to enrich your logs, but that’s getting a bit ahead of where we are. This also doesn’t address the other major issue we may encounter: what about an address that is not allocated to ARIN? What would be seen in that case?

curl --silent --header "Accept: application/json" | jq '.net.orgRef["@name"]'
"African Network Information Center"

ARIN dutifully notifies us that this IP is allocated to AFRINIC, but lacks any details. Perhaps my internet search skills have failed me, but it doesn’t appear that there is an option to query AFRINIC whois through a REST API. This also seems to be the case at LACNIC, and APNIC has a draft of a proposed REST API but nothing working as of the time of this post. However, it appears that our cry of “help us RIPE NCC, you’re our only hope” has not gone unanswered.
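In a script, that referral check might look something like the following. This is only a minimal sketch: it assumes the ARIN JSON has already been parsed into a dictionary, it reuses the field paths from the jq filters above, and the `summarize_arin` helper and the sample response are hypothetical. Only the AFRINIC organization name comes from the output above; the other registry names in the table are assumptions that should be checked against real responses.

```python
# Org names ARIN is assumed to report for blocks managed by other registries.
# Only the AFRINIC string appears in the output above; the rest are assumptions.
OTHER_RIR_NAMES = {
    "African Network Information Center": "AFRINIC",
    "Asia Pacific Network Information Centre": "APNIC",
    "RIPE Network Coordination Centre": "RIPE NCC",
    "Latin American and Caribbean IP address Regional Registry": "LACNIC",
}


def summarize_arin(response):
    """Pull the fields used in the jq filters out of a parsed ARIN response."""
    net = response.get("net", {})
    org_name = net.get("orgRef", {}).get("@name")
    block = net.get("netBlocks", {}).get("netBlock", {})
    # Assumption: netBlock may be a single object or a list of objects.
    if isinstance(block, list):
        block = block[0] if block else {}
    return {
        "org": org_name,
        "start": block.get("startAddress", {}).get("$"),
        "end": block.get("endAddress", {}).get("$"),
        # None means ARIN holds the block; otherwise the RIR to ask next.
        "referral": OTHER_RIR_NAMES.get(org_name),
    }


# Hypothetical response trimmed to the fields of interest.
# The addresses are placeholders from a documentation range.
sample = {
    "net": {
        "orgRef": {"@name": "African Network Information Center"},
        "netBlocks": {
            "netBlock": {
                "startAddress": {"$": "192.0.2.0"},
                "endAddress": {"$": "192.0.2.255"},
            }
        },
    }
}

print(summarize_arin(sample))
```

A non-empty `referral` value signals that the ARIN answer is only a pointer and that the real details live at another registry.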

RIPE has multiple REST APIs for public use. I did some investigation with the WHOIS REST API, and it offers functionality similar to ARIN's, but it is still limited to RIPE IP addresses. However, another API, RIPEstat, has some interesting data calls that can solve this problem.

Specifically, the Whois data call states that the return data will be

“whois information from the relevant Regional Internet Registry and Routing Registry.”

That sounds like exactly what we need. Making the call to the RIPEstat API yields the following data for the IP addresses from our previous queries. The output is then filtered through jq to show only the RIR.

curl --silent --header "Accept: application/json" | jq '.data.authorities[0]'

curl --silent --header "Accept: application/json" | jq '.data.authorities[0]'
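For scripting, the same first-authority lookup can be expressed in Python. This is a minimal sketch that assumes the parsed response carries a `data.authorities` list, as the jq filter implies. The `first_authority` helper and the sample dictionaries are hypothetical.

```python
def first_authority(response):
    """Return the first listed authority, or None if the list is missing or empty."""
    authorities = response.get("data", {}).get("authorities") or []
    return authorities[0] if authorities else None


# Hypothetical, trimmed responses for illustration only.
afrinic_like = {"data": {"authorities": ["afrinic"]}}
no_authority = {"data": {"authorities": []}}

print(first_authority(afrinic_like))
print(first_authority(no_authority))
```

Unlike the bare jq filter, this version returns `None` instead of failing when the list is absent, which is convenient when looping over many addresses.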

Now that output does not prove that the data returned is exactly what we want, but we can do additional parsing to extract data. This is where we start to see the need to move beyond a simple curl call on the command line to a script that can do more sophisticated parsing of JSON data. There are several notable reasons: the JSON data returned from RIPEstat is a series of nested lists, the data elements returned may differ, and the JSON key-value pairs themselves are generic. As shown below, every data element in a list is returned with the same set of key names.

{
  "key": "OrgName",
  "value": "Microsoft Corporation",
  "details_link": null
},
{
  "key": "OrgId",
  "value": "MSFT",
  "details_link": null
}

The snippet shows that we cannot do a simple lookup on a key name. Instead, we must match on the contents of "key" to find the data element, and then read the corresponding "value" to get the information that we want. This, coupled with the variable data elements that are returned, means that we have to graduate to something more sophisticated.
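To make that two-step lookup concrete, here is a minimal Python sketch. It assumes the response has been parsed into a dictionary whose `data.records` is a list of lists of key/value/details_link entries, as described above. The `extract_fields` helper, the `WANTED` set, and the trimmed sample (built from values that appear in this post) are hypothetical.

```python
import json

# Hypothetical, trimmed response: records is a list of lists of entries.
SAMPLE = json.loads("""
{
  "data": {
    "records": [
      [
        {"key": "OrgName", "value": "Microsoft Corporation", "details_link": null},
        {"key": "OrgId", "value": "MSFT", "details_link": null},
        {"key": "City", "value": "Redmond", "details_link": null},
        {"key": "Country", "value": "US", "details_link": null}
      ],
      [
        {"key": "descr", "value": "Glo Mobile Ghana Telco", "details_link": null},
        {"key": "country", "value": "GH", "details_link": null}
      ]
    ]
  }
}
""")

# Element names of interest, compared case-insensitively.
WANTED = {"orgname", "cidr", "city", "stateprov", "country", "descr"}


def extract_fields(response, wanted=WANTED):
    """Return one dict per record, mapping each wanted key name to its values."""
    results = []
    for record in response.get("data", {}).get("records", []):
        fields = {}
        for entry in record:
            key = entry.get("key", "")
            # Step 1: match on the contents of "key"; step 2: read "value".
            if key.lower() in wanted:
                fields.setdefault(key, []).append(entry.get("value"))
        if fields:
            results.append(fields)
    return results


for record in extract_fields(SAMPLE):
    print(record)
```

Values are collected in lists because the same element name can appear more than once in a record. Matching on exact, lowercased names is a design choice for this sketch. It is stricter than a substring match.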

However, we can still do a quick extraction of important data. Instead of trying to use jq to specify an exact location in the JSON, we can note a few things about the data structure: the JSON elements always contain the keys key, value, and details_link; the names of the data elements are consistent if they are present (OrgName, OrgId, etc); and if we pass the output of our call to jq with a simple filter to only look at the records for the IP address, we could just use grep to grab the data we want.

 curl --silent --header "Accept: application/json" | jq -S '.data.records' | grep -E -A 1 -i 'orgname|cidr|city|state|prov|country|descr' | grep -v '\-\-'
 "key": "descr",
 "value": "Glo Mobile Ghana Telco"
 "key": "country",
 "value": "GH"

curl --silent --header "Accept: application/json" | jq -S '.data.records' | grep -E -A 1 -i 'orgname|cidr|city|state|prov|country|descr' | grep -v '\-\-'
"key": "CIDR",
"value": ""
"key": "CIDR",
"value": ""
"key": "OrgName",
"value": "American Registry for Internet Numbers"
"key": "City",
"value": "Centreville"
"key": "StateProv",
"value": "VA"
"key": "Country",
"value": "US"
"key": "OrgName",
"value": "Microsoft Corporation"
"key": "City",
"value": "Redmond"
"key": "StateProv",
"value": "WA"
"key": "Country",
"value": "US"

So we queried RIPEstat with an AFRINIC and an ARIN address and still received information on each IP rather than a referral. Now we can query a single RIR source to get information on any IP address, without needing an API key, and without cost. That is a big help, but clearly we will need to improve the data parsing. While we can get a rough look with this approach, it would be helpful to get a more consistent view. In the next installment, we will work on migrating this lookup to a Python script.
