Azure AD Research with ROADtools
Contents
Azure AD Research with ROADtools
#
2022-07-17
In this post, I show how to inspect Graph API metadata and describe how these relate to objects in Azure AD. I share how the invaluable project
ROADtools
uses this metadata to represent an Azure AD tenant in the form of a relational database. I outline several approaches to analyze this database using native SQL, thesqlalchemy
ORM, and thepandas
data analysis library. Lastly, I introduce a proof-of-conceptROADtools
plugin namedROAD2timeline
that makes timeline analysis of Azure AD objects quick and easy.
Azure AD Data Model#
The Graph APIs expose the data model for Azure AD and its directory objects. The legacy Azure AD Graph API and the modern Microsoft Graph API are both OData REST APIs. OData REST APIs provide descriptive XML metadata about their endpoints, entities, relationships, URL parameters, and response types to make it easier for clients to consume. OData metadata serves the same purpose as the more recent OpenAPI specification.
In this post, I focus on the service principal data model for demonstration purposes. This helps contrast how the same Azure AD object can be represented in different paradigms. I will cover how to work with the service principal data model as:
OData entities
Tables in a SQL database
Python classes
pandas
DataFrames
Each of these approaches has trade-offs, but all afford the same benefit: convenient programmatic access to low-level information about the current state of an Azure AD tenant. This information is tremendously useful for offensive, defensive, and research purposes.
Download Graph Metadata#
Warning
If you are already familiar with the basic concepts of OData, you can skip ahead to the next section.
We can download the OData metadata for the Graph API(s) in XML format using curl
or any other HTTP client.
az rest --url "https://graph.windows.net/\$metadata" --output-file legacy_graph_metadata.xml
az rest --url "https://graph.microsoft.com/beta/\$metadata" --output-file modern_graph_beta_metadata.xml
We can now explore the data structures using XPath queries. Let’s take a look at how service principal is defined in the legacy Azure AD Graph API OData metadata:
import xml.etree.cElementTree as ET
import xml.dom.minidom
# Read the OData XML metadata file for
# the legacy Graph API.
graph_metadata = ET.parse("legacy_graph_metadata.xml")
# Search for `EntityType` elements in any namespace
# with the `Name` attribute of "ServicePrincipal".
# Note that entity names are PascalCase in the
# legacy Graph, but are camelCase in the modern
# Graph metadata.
service_principal_entity = graph_metadata.find(
'.//{*}EntityType[@Name="ServicePrincipal"]'
)
print(
xml.dom.minidom.parseString(
ET.tostring(service_principal_entity, encoding="unicode")
).toprettyxml()
)
<?xml version="1.0" ?>
<ns0:EntityType xmlns:ns0="http://schemas.microsoft.com/ado/2009/11/edm" Name="ServicePrincipal" BaseType="Microsoft.DirectoryServices.DirectoryObject" OpenType="true">
<ns0:Property Name="accountEnabled" Type="Edm.Boolean"/>
<ns0:Property Name="addIns" Type="Collection(Microsoft.DirectoryServices.AddIn)" Nullable="false"/>
<ns0:Property Name="alternativeNames" Type="Collection(Edm.String)" Nullable="false"/>
<ns0:Property Name="appDisplayName" Type="Edm.String"/>
<ns0:Property Name="appId" Type="Edm.String"/>
<ns0:Property Name="applicationTemplateId" Type="Edm.String"/>
<ns0:Property Name="appOwnerTenantId" Type="Edm.Guid"/>
<ns0:Property Name="appRoleAssignmentRequired" Type="Edm.Boolean" Nullable="false"/>
<ns0:Property Name="appRoles" Type="Collection(Microsoft.DirectoryServices.AppRole)" Nullable="false"/>
<ns0:Property Name="displayName" Type="Edm.String"/>
<ns0:Property Name="errorUrl" Type="Edm.String"/>
<ns0:Property Name="homepage" Type="Edm.String"/>
<ns0:Property Name="informationalUrls" Type="Microsoft.DirectoryServices.InformationalUrl"/>
<ns0:Property Name="keyCredentials" Type="Collection(Microsoft.DirectoryServices.KeyCredential)" Nullable="false"/>
<ns0:Property Name="logoutUrl" Type="Edm.String"/>
<ns0:Property Name="notificationEmailAddresses" Type="Collection(Edm.String)" Nullable="false"/>
<ns0:Property Name="oauth2Permissions" Type="Collection(Microsoft.DirectoryServices.OAuth2Permission)" Nullable="false"/>
<ns0:Property Name="passwordCredentials" Type="Collection(Microsoft.DirectoryServices.PasswordCredential)" Nullable="false"/>
<ns0:Property Name="preferredSingleSignOnMode" Type="Edm.String"/>
<ns0:Property Name="preferredTokenSigningKeyEndDateTime" Type="Edm.DateTime"/>
<ns0:Property Name="preferredTokenSigningKeyThumbprint" Type="Edm.String"/>
<ns0:Property Name="publisherName" Type="Edm.String"/>
<ns0:Property Name="replyUrls" Type="Collection(Edm.String)" Nullable="false"/>
<ns0:Property Name="samlMetadataUrl" Type="Edm.String"/>
<ns0:Property Name="samlSingleSignOnSettings" Type="Microsoft.DirectoryServices.SamlSingleSignOnSettings"/>
<ns0:Property Name="servicePrincipalNames" Type="Collection(Edm.String)" Nullable="false"/>
<ns0:Property Name="servicePrincipalType" Type="Edm.String"/>
<ns0:Property Name="signInAudience" Type="Edm.String"/>
<ns0:Property Name="tags" Type="Collection(Edm.String)" Nullable="false"/>
<ns0:Property Name="tokenEncryptionKeyId" Type="Edm.Guid"/>
<ns0:NavigationProperty Name="appRoleAssignedTo" Relationship="Microsoft.DirectoryServices.Microsoft_DirectoryServices_ServicePrincipal_appRoleAssignedTo_Microsoft_DirectoryServices_AppRoleAssignment_appRoleAssignedToPartner" ToRole="appRoleAssignedTo" FromRole="appRoleAssignedToPartner" ContainsTarget="true"/>
<ns0:NavigationProperty Name="appRoleAssignments" Relationship="Microsoft.DirectoryServices.Microsoft_DirectoryServices_ServicePrincipal_appRoleAssignments_Microsoft_DirectoryServices_AppRoleAssignment_appRoleAssignmentsPartner" ToRole="appRoleAssignments" FromRole="appRoleAssignmentsPartner" ContainsTarget="true"/>
<ns0:NavigationProperty Name="oauth2PermissionGrants" Relationship="Microsoft.DirectoryServices.Microsoft_DirectoryServices_ServicePrincipal_oauth2PermissionGrants_Microsoft_DirectoryServices_OAuth2PermissionGrant_oauth2PermissionGrantsPartner" ToRole="oauth2PermissionGrants" FromRole="oauth2PermissionGrantsPartner" ContainsTarget="true"/>
<ns0:NavigationProperty Name="policies" Relationship="Microsoft.DirectoryServices.Microsoft_DirectoryServices_ServicePrincipal_policies_Microsoft_DirectoryServices_DirectoryObject_policiesPartner" ToRole="policies" FromRole="policiesPartner"/>
<ns0:NavigationProperty Name="serviceEndpoints" Relationship="Microsoft.DirectoryServices.Microsoft_DirectoryServices_ServicePrincipal_serviceEndpoints_Microsoft_DirectoryServices_ServiceEndpoint_serviceEndpointsPartner" ToRole="serviceEndpoints" FromRole="serviceEndpointsPartner" ContainsTarget="true"/>
</ns0:EntityType>
Anatomy of a Service Principal#
An EntityType
element defines the service principal object in the Azure AD data model.
Entities have a primary Key
and one or more named Property
elements, which represent information of a specific data type.
These properties are the public attributes of Azure AD objects.
For investigators, each entity and property could be a piece of evidence. Therefore, it is helpful to understand these data structures and how threat activity appears in them.
The element <Property Name="appId" Type="Edm.String"/>
indicates that service principals have an appId
attribute, the value of which is stored as a string.
Likewise, the element <Property Name="preferredTokenSigningKeyEndDateTime" Type="Edm.DateTime"/>
indicates the preferredTokenSigningKeyEndDateTime
attribute on a service principal is a timestamp.
Properties may also be a Collection
, which means the property contains an array of values.
Named Properties
<EntityType xmlns="http://schemas.microsoft.com/ado/2009/11/edm" Name="ServicePrincipal" BaseType="Microsoft.DirectoryServices.DirectoryObject" OpenType="true">
<Property Name="accountEnabled" Type="Edm.Boolean"/>
<Property Name="addIns" Type="Collection(Microsoft.DirectoryServices.AddIn)" Nullable="false"/>
<Property Name="alternativeNames" Type="Collection(Edm.String)" Nullable="false"/>
<Property Name="appDisplayName" Type="Edm.String"/>
<Property Name="appId" Type="Edm.String"/>
<!-- ...snip... -->
<Property Name="passwordCredentials" Type="Collection(Microsoft.DirectoryServices.PasswordCredential)" Nullable="false"/>
<Property Name="preferredSingleSignOnMode" Type="Edm.String"/>
<Property Name="preferredTokenSigningKeyEndDateTime" Type="Edm.DateTime"/>
<!-- ...snip... -->
</EntityType>
All OData entities contain properties. Properties have names and types.
Entities can inherit properties from other entities as their BaseType
.
For the legacy Graph API, service principals are derived from the base type Microsoft.DirectoryServices.DirectoryObject
and implement all of its properties.
Importantly, service principals inherit their primary key property - named objectId
- from this base type.
Inheritance
<EntityType xmlns="http://schemas.microsoft.com/ado/2009/11/edm" Name="DirectoryObject" OpenType="true">
<Key>
<PropertyRef Name="objectId"/>
</Key>
<Property Name="objectType" Type="Edm.String"/>
<Property Name="objectId" Type="Edm.String" Nullable="false"/>
<Property Name="deletionTimestamp" Type="Edm.DateTime"/>
<!-- ... snip ... -->
</EntityType>
Some ServicePrincipal
properties are derived from the DirectoryObject
base type.
Notice the deletionTimestamp
property, which may have forensic significance.
Furthermore, properties can be nested using ComplexType
.
The passwordCredentials
property on a service principal is a collection of Microsoft.DirectoryServices.PasswordCredential
objects, which are complex types.
If we run the XPath query .//{*}ComplexType[@Name="PasswordCredential"]
, we can see that the complex type PasswordCredential
contains multiple properties.
And once again, note that some of these properties contain timestamps.
Nested Properties
<ComplexType xmlns="http://schemas.microsoft.com/ado/2009/11/edm" Name="PasswordCredential">
<Property Name="customKeyIdentifier" Type="Edm.Binary"/>
<Property Name="endDate" Type="Edm.DateTime"/>
<Property Name="keyId" Type="Edm.Guid"/>
<Property Name="startDate" Type="Edm.DateTime"/>
<Property Name="value" Type="Edm.String"/>
</ComplexType>
If an attacker performs technique T1098.001 - Account Manipulation: Additional Cloud Credentials to add a password credential to a service principal, it will appear in the passwordCredentials
property of a ServicePrincipal
object.
Lastly, NavigationProperty
elements represent a relationship between an entities.
For example, the appRoleAssignments
navigation property will be populated when an application permission is granted to a service principal.
The primary key for the service principal appears as the foreign key principalId
in an AppRoleAssignment
entity:
Relational Fields
<!-- ...snip... -->
<NavigationProperty Name="appRoleAssignments" Relationship="Microsoft.DirectoryServices.Microsoft_DirectoryServices_ServicePrincipal_appRoleAssignments_Microsoft_DirectoryServices_AppRoleAssignment_appRoleAssignmentsPartner" ToRole="appRoleAssignments" FromRole="appRoleAssignmentsPartner" ContainsTarget="true"/>
<!-- ...snip... -->
<EntityType xmlns="http://schemas.microsoft.com/ado/2009/11/edm" Name="AppRoleAssignment" BaseType="Microsoft.DirectoryServices.DirectoryObject" OpenType="true">
<Property Name="creationTimestamp" Type="Edm.DateTime"/>
<Property Name="id" Type="Edm.Guid" Nullable="false"/>
<Property Name="principalDisplayName" Type="Edm.String"/>
<Property Name="principalId" Type="Edm.Guid"/> <!-- servicePrincipal.objectId -->
<Property Name="principalType" Type="Edm.String"/>
<Property Name="resourceDisplayName" Type="Edm.String"/>
<Property Name="resourceId" Type="Edm.Guid"/>
</EntityType>
The primary key for a ServicePrincipal
is its inheirited objectId
property.
This relates to the principalId
property on an AppRoleAssignment
entity, if it has consent for an application permission.
Compare with Graph API Response#
When we call a Graph API endpoint, we can see that the response conforms to the data structures described in the OData metadata. For example, let’s query for information about service principals in a given Azure AD tenant:
# If not already logged in
az login --use-device-code
# Legacy Graph query
az rest --url "https://graph.windows.net/bf016c83-cab4-4919-98b0-ab45e3acd4b5/servicePrincipals?api-version=1.6" --query "value[0]"
# Modern Graph equivalent
#!az rest --url "https://graph.microsoft.com/beta/bf016c83-cab4-4919-98b0-ab45e3acd4b5/servicePrincipals" --query "value[0]"
{
"accountEnabled": true,
"addIns": [],
"alternativeNames": [],
"appDisplayName": "O365Account",
"appId": "1cda9b54-9852-4a5a-96d4-c2ab174f9edf",
"appOwnerTenantId": "f8cdef31-a31e-4b4a-93e4-5f571e91255a",
"appRoleAssignmentRequired": false,
"appRoles": [],
"applicationTemplateId": null,
"deletionTimestamp": null,
"displayName": "O365Account",
"errorUrl": null,
"homepage": null,
"informationalUrls": {
"marketing": null,
"privacy": null,
"support": null,
"termsOfService": null
},
"keyCredentials": [],
"logoutUrl": null,
"notificationEmailAddresses": [],
"oauth2Permissions": [],
"objectId": "b2ee120f-7070-434d-821a-163b8ee63c57",
"objectType": "ServicePrincipal",
"odata.type": "Microsoft.DirectoryServices.ServicePrincipal",
"passwordCredentials": [
{
"customKeyIdentifier": null,
"endDate": "2022-10-21T20:40:45.6027957Z",
"keyId": "d8824efe-0654-444b-a163-c16d3049380b",
"startDate": "2021-10-21T20:40:45.6027957Z",
"value": null
},
{
"customKeyIdentifier": null,
"endDate": "2022-10-20T14:40:06.1955755Z",
"keyId": "9f5a00c7-664e-43f7-9f8b-bffed48d9690",
"startDate": "2021-10-20T14:40:06.1955755Z",
"value": null
},
{
"customKeyIdentifier": null,
"endDate": "2022-10-20T14:32:29.8531814Z",
"keyId": "cdd04e28-7681-4022-8902-236243c0a085",
"startDate": "2021-10-20T14:32:29.8531814Z",
"value": null
}
],
"preferredSingleSignOnMode": null,
"preferredTokenSigningKeyEndDateTime": null,
"preferredTokenSigningKeyThumbprint": null,
"publisherName": "Microsoft Services",
"replyUrls": [],
"samlMetadataUrl": null,
"samlSingleSignOnSettings": null,
"servicePrincipalNames": [
"1cda9b54-9852-4a5a-96d4-c2ab174f9edf"
],
"servicePrincipalType": "Application",
"signInAudience": "AzureADMultipleOrgs",
"tags": [],
"tokenEncryptionKeyId": null
}
Notice how the keys in the JSON response align with the Property
elements in the OData metadata.
This section focused on service principals, but again these concepts apply to all Azure AD objects. While the OData metadata for the Graph APIs provides a rich description of Azure AD objects and their attributes, investigators still need a way to collect the potential evidence.
ROADtools
#
ROADtools
is an open source Azure AD analysis framework.
ROADtools
exports directory information from an Azure AD tenant using the legacy Azure AD Graph API and stores the results in a SQL database for offline analysis. ROADtools
provides a web client to conveniently explore the exported Azure AD data, but you can also interact directly with its SQLite database using your preferred tools.
I have used
ROADtools
extensively for incident response and research purposes. It is an excellent way to take a “snapshot” of a potentially compromise Azure AD tenant for subsequent forensic analysis.
ROADtools
uses the Azure AD Graph OData metadata discussed in the previous section to dynamically generate a SQL database schema. OData entities are mapped to SQL tables
with their properties as columns and navigation properties define the relationships between tables.
ROADtools
queries the legacy Azure AD Graph and stores the results as rows in this database.
By default, ROADtools
will create a single SQLite database file named roadrecon.db
.
Getting Started#
Let’s walk through how to use ROADtools
to dump data from an Azure AD tenant.
Firstly, we need to install ROADtools
, if we have not already.
pip install roadtools sqlite-utils
Next, we need to authenticate to Azure AD so we can call the Graph API. Here we are using a device code flow, but ROADtools
supports other forms of authentication.
roadrecon auth --device-code
Now we are ready to dump data from Azure AD. We use the roadrecon gather
command to make concurrent requests to the legacy Azure AD Graph API. The responses populate the appropriate tables in the sqlite
dataqbase.
roadrecon gather
Once it has finished gathering, we should have a roadrecon.db
file in our current working directory.
ls -al | grep roadrecon.db
-rw-r--r-- 1 vscode vscode 2797568 Jul 5 21:31 roadrecon.db
Explore roadrecon.db
#
Now that ROADtools
has exported data from our Azure AD tenant, we can analyze the roadrecon.db
database using whatever tools make us the most comfortable. If this is your first time using ROADtools
, I recommend browsing the web interface to get oriented.
I am more interested in automated ways of analyzing the Azure AD data in roadrecon.db
, so here are a few programmatic approaches that ultimately provide similar outcomes:
Using Native SQL#
Since roadrecon.db
is just a SQLite database file, we can use an off-the-shelf SQLite browser such as DB Browser for SQLite for interactive use. But I will show how to use the incredible sqlite-utils
library to poke at the database instead.
Firstly, we can dump the schema from the ServicePrincipals
table and compare it with the OData metadata we covered earlier.
Notice how the columns in the table map to the properties in the OData entity, but the types are slightly different.
It is common for SQL databases to represent nested data (such as an OData ComplexType
) as a TEXT
or JSON
column.
That is the case for the passwordCredentials
column below.
sqlite-utils schema roadrecon.db ServicePrincipals
CREATE TABLE "ServicePrincipals" (
"objectType" TEXT,
"objectId" TEXT NOT NULL,
"deletionTimestamp" DATETIME,
"accountEnabled" BOOLEAN,
"addIns" TEXT,
"alternativeNames" TEXT,
"appBranding" TEXT,
"appCategory" TEXT,
"appData" TEXT,
"appDisplayName" TEXT,
"appId" TEXT,
"applicationTemplateId" TEXT,
"appMetadata" TEXT,
"appOwnerTenantId" TEXT,
"appRoleAssignmentRequired" BOOLEAN,
"appRoles" TEXT,
"authenticationPolicy" TEXT,
"disabledByMicrosoftStatus" TEXT,
"displayName" TEXT,
"errorUrl" TEXT,
homepage TEXT,
"informationalUrls" TEXT,
"keyCredentials" TEXT,
"logoutUrl" TEXT,
"managedIdentityResourceId" TEXT,
"microsoftFirstParty" BOOLEAN,
"notificationEmailAddresses" TEXT,
"oauth2Permissions" TEXT,
"passwordCredentials" TEXT,
"preferredSingleSignOnMode" TEXT,
"preferredTokenSigningKeyEndDateTime" DATETIME,
"preferredTokenSigningKeyThumbprint" TEXT,
"publisherName" TEXT,
"replyUrls" TEXT,
"samlMetadataUrl" TEXT,
"samlSingleSignOnSettings" TEXT,
"servicePrincipalNames" TEXT,
tags TEXT,
"tokenEncryptionKeyId" TEXT,
"servicePrincipalType" TEXT,
"useCustomTokenSigningKey" BOOLEAN,
"verifiedPublisher" TEXT,
PRIMARY KEY ("objectId"),
CHECK ("accountEnabled" IN (0, 1)),
CHECK ("appRoleAssignmentRequired" IN (0, 1)),
CHECK ("microsoftFirstParty" IN (0, 1)),
CHECK ("useCustomTokenSigningKey" IN (0, 1))
)
Obviously, we can use SQL to query the database. Here we are just querying for a single row from service principal table.
sqlite-utils roadrecon.db "SELECT * FROM ServicePrincipals LIMIT 1;" | python3 -mjson.tool
[
{
"objectType": "ServicePrincipal",
"objectId": "b2ee120f-7070-434d-821a-163b8ee63c57",
"deletionTimestamp": null,
"accountEnabled": 1,
"addIns": "[]",
"alternativeNames": "[]",
"appBranding": null,
"appCategory": null,
"appData": null,
"appDisplayName": "O365Account",
"appId": "1cda9b54-9852-4a5a-96d4-c2ab174f9edf",
"applicationTemplateId": null,
"appMetadata": null,
"appOwnerTenantId": "f8cdef31-a31e-4b4a-93e4-5f571e91255a",
"appRoleAssignmentRequired": 0,
"appRoles": "[]",
"authenticationPolicy": null,
"disabledByMicrosoftStatus": null,
"displayName": "O365Account",
"errorUrl": null,
"homepage": null,
"informationalUrls": "{\"termsOfService\": null, \"support\": null, \"privacy\": null, \"marketing\": null}",
"keyCredentials": "[]",
"logoutUrl": null,
"managedIdentityResourceId": null,
"microsoftFirstParty": 1,
"notificationEmailAddresses": "[]",
"oauth2Permissions": "[]",
"passwordCredentials": "[{\"customKeyIdentifier\": null, \"endDate\": \"2022-10-21T20:40:45.6027957Z\", \"keyId\": \"d8824efe-0654-444b-a163-c16d3049380b\", \"startDate\": \"2021-10-21T20:40:45.6027957Z\", \"value\": null}, {\"customKeyIdentifier\": null, \"endDate\": \"2022-10-20T14:40:06.1955755Z\", \"keyId\": \"9f5a00c7-664e-43f7-9f8b-bffed48d9690\", \"startDate\": \"2021-10-20T14:40:06.1955755Z\", \"value\": null}, {\"customKeyIdentifier\": null, \"endDate\": \"2022-10-20T14:32:29.8531814Z\", \"keyId\": \"cdd04e28-7681-4022-8902-236243c0a085\", \"startDate\": \"2021-10-20T14:32:29.8531814Z\", \"value\": null}]",
"preferredSingleSignOnMode": null,
"preferredTokenSigningKeyEndDateTime": null,
"preferredTokenSigningKeyThumbprint": null,
"publisherName": "Microsoft Services",
"replyUrls": "[]",
"samlMetadataUrl": null,
"samlSingleSignOnSettings": null,
"servicePrincipalNames": "[\"1cda9b54-9852-4a5a-96d4-c2ab174f9edf\"]",
"tags": "[]",
"tokenEncryptionKeyId": null,
"servicePrincipalType": "Application",
"useCustomTokenSigningKey": null,
"verifiedPublisher": "{\"displayName\": null, \"verifiedPublisherId\": null, \"addedDateTime\": null}"
}
]
The native SQL approach to working with roadrecon.db
really shines if you are very comfortable writing expressive SQL queries.
SQLite also supports attaching multiple databases in a single session.
This lets you easily “federate” a SQL query over multiple roadrecon.db
databases, perhaps taken from different Azure AD tenants or taken from a different time period.
Using sqlalchemy
ORM#
We can also query the SQLite database using the sqlalchemy
Object Relational Mapping (ORM) that ROADtools
created for us. ORMs include Python classes that correspond to tables in the database. This provides auto-completion support, better serialization/deserialization, and other quality-of-life improvements when interacting with the Azure AD data in an IDE. The backend for the ROADtools
web interface primarily uses the ORM to interact with data in the roadrecon.db
.
Just as we parsed the XML OData metadata and dumped the SQLite schema using sqlite-utils
, we can do the same using the sqlalchemy
ORM to list available tables and their respective columns.
import sqlalchemy
import roadtools.roadlib.metadef.database as database
# Defaults to `roadrecon.db`
engine = database.init()
# Use reflection to collect
# metadata related to SQL tables
# and their columns.
metadata = sqlalchemy.MetaData(bind=engine, schema="main")
metadata.reflect()
for table in metadata.sorted_tables:
if table.name == "ServicePrincipals":
print(f"\nTable: {table.name}")
for column in table.columns:
print(f'\t|_ {column.name} ({column.type})')
Table: ServicePrincipals
|_ objectType (TEXT)
|_ objectId (TEXT)
|_ deletionTimestamp (DATETIME)
|_ accountEnabled (BOOLEAN)
|_ addIns (TEXT)
|_ alternativeNames (TEXT)
|_ appBranding (TEXT)
|_ appCategory (TEXT)
|_ appData (TEXT)
|_ appDisplayName (TEXT)
|_ appId (TEXT)
|_ applicationTemplateId (TEXT)
|_ appMetadata (TEXT)
|_ appOwnerTenantId (TEXT)
|_ appRoleAssignmentRequired (BOOLEAN)
|_ appRoles (TEXT)
|_ authenticationPolicy (TEXT)
|_ disabledByMicrosoftStatus (TEXT)
|_ displayName (TEXT)
|_ errorUrl (TEXT)
|_ homepage (TEXT)
|_ informationalUrls (TEXT)
|_ keyCredentials (TEXT)
|_ logoutUrl (TEXT)
|_ managedIdentityResourceId (TEXT)
|_ microsoftFirstParty (BOOLEAN)
|_ notificationEmailAddresses (TEXT)
|_ oauth2Permissions (TEXT)
|_ passwordCredentials (TEXT)
|_ preferredSingleSignOnMode (TEXT)
|_ preferredTokenSigningKeyEndDateTime (DATETIME)
|_ preferredTokenSigningKeyThumbprint (TEXT)
|_ publisherName (TEXT)
|_ replyUrls (TEXT)
|_ samlMetadataUrl (TEXT)
|_ samlSingleSignOnSettings (TEXT)
|_ servicePrincipalNames (TEXT)
|_ tags (TEXT)
|_ tokenEncryptionKeyId (TEXT)
|_ servicePrincipalType (TEXT)
|_ useCustomTokenSigningKey (BOOLEAN)
|_ verifiedPublisher (TEXT)
Here is how to get the first row from the ServicePrincipals
table using the ORM.
from roadtools.roadlib.metadef.database import ServicePrincipal
# Connect to the database
session = database.get_session(engine)
# Use ORM to query database
session.query(ServicePrincipal).first().as_dict()
{'objectType': 'ServicePrincipal',
'objectId': 'b2ee120f-7070-434d-821a-163b8ee63c57',
'deletionTimestamp': None,
'accountEnabled': True,
'addIns': [],
'alternativeNames': [],
'appBranding': None,
'appCategory': None,
'appData': None,
'appDisplayName': 'O365Account',
'appId': '1cda9b54-9852-4a5a-96d4-c2ab174f9edf',
'applicationTemplateId': None,
'appMetadata': None,
'appOwnerTenantId': 'f8cdef31-a31e-4b4a-93e4-5f571e91255a',
'appRoleAssignmentRequired': False,
'appRoles': [],
'authenticationPolicy': None,
'disabledByMicrosoftStatus': None,
'displayName': 'O365Account',
'errorUrl': None,
'homepage': None,
'informationalUrls': {'termsOfService': None,
'support': None,
'privacy': None,
'marketing': None},
'keyCredentials': [],
'logoutUrl': None,
'managedIdentityResourceId': None,
'microsoftFirstParty': True,
'notificationEmailAddresses': [],
'oauth2Permissions': [],
'passwordCredentials': [{'customKeyIdentifier': None,
'endDate': '2022-10-21T20:40:45.6027957Z',
'keyId': 'd8824efe-0654-444b-a163-c16d3049380b',
'startDate': '2021-10-21T20:40:45.6027957Z',
'value': None},
{'customKeyIdentifier': None,
'endDate': '2022-10-20T14:40:06.1955755Z',
'keyId': '9f5a00c7-664e-43f7-9f8b-bffed48d9690',
'startDate': '2021-10-20T14:40:06.1955755Z',
'value': None},
{'customKeyIdentifier': None,
'endDate': '2022-10-20T14:32:29.8531814Z',
'keyId': 'cdd04e28-7681-4022-8902-236243c0a085',
'startDate': '2021-10-20T14:32:29.8531814Z',
'value': None}],
'preferredSingleSignOnMode': None,
'preferredTokenSigningKeyEndDateTime': None,
'preferredTokenSigningKeyThumbprint': None,
'publisherName': 'Microsoft Services',
'replyUrls': [],
'samlMetadataUrl': None,
'samlSingleSignOnSettings': None,
'servicePrincipalNames': ['1cda9b54-9852-4a5a-96d4-c2ab174f9edf'],
'tags': [],
'tokenEncryptionKeyId': None,
'servicePrincipalType': 'Application',
'useCustomTokenSigningKey': None,
'verifiedPublisher': {'displayName': None,
'verifiedPublisherId': None,
'addedDateTime': None}}
I recommend the ORM approach if you aren’t comfortable writing SQL queries by hand.
There are also a ton of resources out there about the sqlalchemy
ORM, so it is well-documented and supported.
Using pandas
DataFrames#
You can do a lot with native SQL queries and the sqlalchemy
ORM, but I am most comfortable doing data analysis in Python with the pandas
module. pandas
DataFrames are tabular data structures containing Python objects. The pandas
ecosystem is wildly popular and well-documented, so it provides additional functionality when munging and enriching the Azure AD data collected by ROADtools
.
Here’s a useful snippet to dump all the tables as DataFrames:
import pandas as pd
def table_to_dataframe(
session: sqlalchemy.orm.session.Session,
table: sqlalchemy.sql.schema.Table
) -> pd.DataFrame:
"""Takes a SQLAlchemy Table and returns all the rows
in that table as a `pd.DataFrame`.
"""
print(f"Converting table {table.name} to pd.DataFrame")
rows = session.query(table).all()
if rows:
return pd.json_normalize([row._asdict() for row in rows])
else:
return pd.DataFrame()
tables = {
table.name: table_to_dataframe(session, table)
for table in metadata.sorted_tables
}
Converting table AppRoleAssignments to pd.DataFrame
Converting table ApplicationRefs to pd.DataFrame
Converting table Applications to pd.DataFrame
Converting table AuthorizationPolicys to pd.DataFrame
Converting table Contacts to pd.DataFrame
Converting table Devices to pd.DataFrame
Converting table DirectoryRoles to pd.DataFrame
Converting table DirectorySettings to pd.DataFrame
Converting table ExtensionPropertys to pd.DataFrame
Converting table Groups to pd.DataFrame
Converting table OAuth2PermissionGrants to pd.DataFrame
Converting table Policys to pd.DataFrame
Converting table RoleAssignments to pd.DataFrame
Converting table RoleDefinitions to pd.DataFrame
Converting table ServicePrincipals to pd.DataFrame
Converting table TenantDetails to pd.DataFrame
Converting table Users to pd.DataFrame
Converting table lnk_application_owner_serviceprincipal to pd.DataFrame
Converting table lnk_application_owner_user to pd.DataFrame
Converting table lnk_device_owner to pd.DataFrame
Converting table lnk_group_member_contact to pd.DataFrame
Converting table lnk_group_member_device to pd.DataFrame
Converting table lnk_group_member_group to pd.DataFrame
Converting table lnk_group_member_serviceprincipal to pd.DataFrame
Converting table lnk_group_member_user to pd.DataFrame
Converting table lnk_role_member_group to pd.DataFrame
Converting table lnk_role_member_serviceprincipal to pd.DataFrame
Converting table lnk_role_member_user to pd.DataFrame
Converting table lnk_serviceprincipal_owner_serviceprincipal to pd.DataFrame
Converting table lnk_serviceprincipal_owner_user to pd.DataFrame
You can then access each DataFrame using the table name as the key:
service_principals_df = tables["ServicePrincipals"]
# The `.T` method flips the DataFrame axes.
# This is just for legibility.
service_principals_df.head(1).T
0 | |
---|---|
objectType | ServicePrincipal |
objectId | b2ee120f-7070-434d-821a-163b8ee63c57 |
deletionTimestamp | None |
accountEnabled | True |
addIns | [] |
alternativeNames | [] |
appBranding | None |
appCategory | None |
appData | None |
appDisplayName | O365Account |
appId | 1cda9b54-9852-4a5a-96d4-c2ab174f9edf |
applicationTemplateId | None |
appMetadata | None |
appOwnerTenantId | f8cdef31-a31e-4b4a-93e4-5f571e91255a |
appRoleAssignmentRequired | False |
appRoles | [] |
authenticationPolicy | None |
disabledByMicrosoftStatus | None |
displayName | O365Account |
errorUrl | None |
homepage | None |
informationalUrls | {"termsOfService": null, "support": null, "pri... |
keyCredentials | [] |
logoutUrl | None |
managedIdentityResourceId | None |
microsoftFirstParty | True |
notificationEmailAddresses | [] |
oauth2Permissions | [] |
passwordCredentials | [{"customKeyIdentifier": null, "endDate": "202... |
preferredSingleSignOnMode | None |
preferredTokenSigningKeyEndDateTime | None |
preferredTokenSigningKeyThumbprint | None |
publisherName | Microsoft Services |
replyUrls | [] |
samlMetadataUrl | None |
samlSingleSignOnSettings | None |
servicePrincipalNames | ["1cda9b54-9852-4a5a-96d4-c2ab174f9edf"] |
tags | [] |
tokenEncryptionKeyId | None |
servicePrincipalType | Application |
useCustomTokenSigningKey | None |
verifiedPublisher | {"displayName": null, "verifiedPublisherId": n... |
I recommend the pandas
approach if you are 1) more comfortable with the pandas
API than SQL queries, 2) doing significant transformations on the original Azure AD object, or 3) integrating the Azure AD data with other datasets (especially for machine learning purposes).
In sum, pandas
offers a deep “bag of tricks” to analyze the data collected by ROADtools
. In the following section, I will show one of the neat ways we can use pandas
and ROADtools
to make a useful forensics tool.
Timeline Analysis with ROAD2timeline
#
Tip
Check out my pull request for a new ROADtools
plugin called ROAD2timeline
, which generates a forensic timeline of events (inspired by tools like log2timeline
/plaso
) based on timestamps found in Azure AD objects.
You may have noticed that there are a lot of columns containing timestamps in roadrecon.db
. We can identify these columns by reading the schema from the database:
for table in metadata.sorted_tables:
timestamp_columns = []
for column in table.columns:
if type(column.type) == sqlalchemy.sql.sqltypes.DATETIME:
timestamp_columns.append(column)
if timestamp_columns:
print(f"\n{table.name}")
for column in timestamp_columns:
print(f"|_ {column.name}")
AppRoleAssignments
|_ deletionTimestamp
|_ creationTimestamp
Applications
|_ deletionTimestamp
Contacts
|_ deletionTimestamp
|_ lastDirSyncTime
Devices
|_ deletionTimestamp
|_ approximateLastLogonTimestamp
|_ complianceExpiryTime
|_ lastDirSyncTime
DirectoryRoles
|_ deletionTimestamp
ExtensionPropertys
|_ deletionTimestamp
Groups
|_ deletionTimestamp
|_ createdDateTime
|_ expirationDateTime
|_ lastDirSyncTime
|_ renewedDateTime
OAuth2PermissionGrants
|_ expiryTime
|_ startTime
Policys
|_ deletionTimestamp
RoleDefinitions
|_ deletionTimestamp
ServicePrincipals
|_ deletionTimestamp
|_ preferredTokenSigningKeyEndDateTime
TenantDetails
|_ deletionTimestamp
|_ companyLastDirSyncTime
|_ createdDateTime
Users
|_ deletionTimestamp
|_ acceptedOn
|_ createdDateTime
|_ employeeHireDate
|_ invitedOn
|_ lastDirSyncTime
|_ lastPasswordChangeDateTime
|_ onPremisesPasswordChangeTimestamp
|_ refreshTokensValidFromDateTime
|_ userStateChangedOn
Wouldn’t it be nice if we had a way to generate a timeline from all these timestamps, similar to how
log2timeline
/plaso
works for deadbox forensics?
I wrote a ROADtools
plugin to generate these timelines. For each column containing a timestamp, the plugin makes a copy the relevant table as a pd.DataFrame
using that timestamp column as a datetime index. Then the plugin concatenates all the DataFrames together and sorts by a unified datetime index. The resulting timeline contains all of the original columns from each table, plus a few additional columns prefixed with an underscore.
Users can provide message templates to populate friendly timeline entries that are easy to read. For example, if a service principal was granted consent for an application permission and this message template was provided:
AppRoleAssignments:
creationTimestamp: App role for {resourceDisplayName} ({resourceId}) assigned to {principalType} named {principalDisplayName} ({principalId})
Then the message template would populate a creationTimestamp
timeline entry that looks like this:
+----------------------------+--------------------+-------------------+---------------------------------------------------------------------------------+
| _timestamp | _table_name | _timestamp_column | _message |
+----------------------------+--------------------+-------------------+---------------------------------------------------------------------------------+
| 2021-10-21 20:35:26.819835 | AppRoleAssignments | creationTimestamp | App role for Microsoft Graph (a643a498-b308-48f5-b35f-2fbc51c2dabe) assigned to |
| | | | ServicePrincipal named RC-TDR - Azure AD Audit |
| | | | (0ddc82d7-db6d-46ae-87a6-add8db5a5eba) |
+----------------------------+--------------------+-------------------+---------------------------------------------------------------------------------+
ROAD2timeline
comes with a default message template for most timestamps on Azure AD objects.
The timeline will generate even if a message template is not available for a given Azure AD object and timestamp column. Please check out the PR for more information.
Generate Azure AD Timeline#
Warning
ROAD2timeline
requires additional Python dependencies that are not included in a default installation of ROADtools
.
Ensure that pandas
is installed in your Python environment prior to running ROAD2timeline
.
The pyarrow
module is also required if you want to save the timeline as a parquet
file.
You can use ROAD2timeline
like any other ROADtools
plugin. Check out the --help
flag for command-line options.
roadrecon plugin road2timeline --help
usage: roadrecon plugin road2timeline [-h] [-d DATABASE] [-t TEMPLATE_FILE] [-f OUTPUT_FILE]
Timeline analysis of Azure AD objects
optional arguments:
-h, --help show this help message and exit
-d DATABASE, --database DATABASE
Database file. Can be the local database name for SQLite, or an SQLAlchemy compatible URL such as postgresql+psycopg2://dirkjan@/roadtools
-t TEMPLATE_FILE, --template-file TEMPLATE_FILE
File containing string templates to translate Azure AD objects into a timeline entry. Defaults to `road2timeline.yaml` in the current working directory.
-f OUTPUT_FILE, --output-file OUTPUT_FILE
File to save timeline outputs. Output format determined by file extension. Supported extensions include: [csv, pickle, jsonl, parquet]
Here is how you can run the plugin to generate a timeline of Azure AD objects. This will generate a timeline in parquet
format. Supported output formats include jsonl
, csv
, and pickle
.
roadrecon plugin road2timeline -d roadrecon.db -f timeline.parquet
Timeline entry template file road2timeline.yaml not found, defaulting to built-in templates
Loading timeline entry templates from /workspaces/detect-dot-dev/temp/ROADtools/roadrecon/roadtools/roadrecon/plugins/road2timeline.yaml
Timeline saved to file /workspaces/detect-dot-dev/blog/blog/posts/timeline.parquet
Analyze Azure AD Timeline#
Here is how I would typically review the timeline generated by ROAD2timeline
.
I like to use pd.Grouper
on DataFrames that have a datetime index.
Instead of returning one row for each timeline entry, pd.Grouper
will aggregate multiple timeline entries together based on whatever frequency you pass as the freq
argument, which takes a date offset string.
In the example below, freq="15T"
means that the DataFrame will bucket together all timeline entries that happen within the same 15 minute period.
This has the nice advantage of reducing the overall number of rows in the timeline and makes clusters of related activity stand out.
Play around with different frequencies and combinations of columns to groupby.
timeline_df = pd.read_parquet("timeline.parquet")
timeline_df.groupby([
pd.Grouper(freq="15T"), # Buckets in 15 minute intervals, optional
"_table_name",
"_timestamp_column",
"_message",
])._object_id.nunique().to_frame().head(5) # Truncating for legibility
_object_id | ||||
---|---|---|---|---|
_timestamp | _table_name | _timestamp_column | _message | |
2021-10-12 16:45:00 | TenantDetails | createdDateTime | No template found for TenantDetails.createdDateTime: Object ID - bf016c83-cab4-4919-98b0-ab45e3acd4b5 | 1 |
Users | createdDateTime | User willem@byrgenwerth.onmicrosoft.com was created in the directory (Cloud SID: S-1-12-1-3551930148-1137481302-3611740840-4497412993) | 1 | |
invitedOn | User willem@byrgenwerth.onmicrosoft.com was invited to the directory (Cloud SID: S-1-12-1-3551930148-1137481302-3611740840-4497412993) | 1 | ||
lastPasswordChangeDateTime | Cloud password was changed for user willem@byrgenwerth.onmicrosoft.com (Cloud SID: S-1-12-1-3551930148-1137481302-3611740840-4497412993) | 1 | ||
2021-10-12 18:15:00 | Groups | createdDateTime | Group All Company was created by application 00000005-0000-0ff1-ce00-000000000000 ( Group Object ID: bbc58eb4-ab52-4dd4-a709-136a72d1f040, SID: None) | 1 |
Future Efforts#
ROADtools
is tremendously useful for offensive, defensive, and research purposes.
The information security community should rally behind projects like ROADtools
that make it easier to explore and understand the inner-workings of the Microsoft cloud.
The biggest shortcoming with ROADtools
is its reliance on the legacy Azure AD Graph API, which Microsoft plans to depreciate.
Furthermore, the modern Microsoft Graph API (especially the beta
version) includes a treasure-trove of additional entities beyond Azure AD that are laden with security value.
I believe the basic premise of ROADtools
can be applied to the modern Microsoft Graph as well.
I have been actively researching several approaches that leverage the OpenAPI description of the Microsoft Graph, rather than the OData metadata.
The evolution of ROADtools
will likely involve projects like pydantic
, datamodel-code-generator
, and sqlmodel
to generate Python and SQL bindings for the (comparatively) massive number of objects on the modern Graph.
I look forward to sharing some of my research once I have found the most elegant solution.
Conclusion#
In this post, we covered:
How to inspect Graph API metadata and described how these relate to objects in Azure AD
How
ROADtools
uses the Graph metadata to represent an Azure AD tenant as theroadrecon.db
SQLite databaseSeveral approaches to analyzing the Azure AD data in
roadrecon.db
:Using native SQL features and tools (such as
sqlite-utils
)Using the
sqlalchemy
ORM thatROADtools
providesUsing the
pandas
data analysis library
A proposed
ROADtools
plugin calledROAD2timeline
, which usespandas
to read fromroadrecon.db
and generates a forensic timeline based on the timestamps found in Azure AD objects
If you made it this far, thank you so much for reading!