- Windows Search Indexer [Microsoft Documentation] is a service which enables faster searching of files, emails, and other content on Windows systems. The service builds an index that the system refers to whenever a search is run.
- Using the Windows Search index, investigators can obtain important data about indexed files and user activity, including:
  - File metadata
  - Limited file contents
  - User interaction with files
  - URLs accessed
- This is a useful artifact for Incident Response investigations or for Digital Forensics cases involving intellectual property theft and departed employees.
- Microsoft changed the structure of the Search index in Windows 11, dropping the former ESE database structure and implementing SQLite in its place. Stroz Friedberg’s research into both the old and new structures has revealed how the information in the Windows 10 Search index is mapped in the Windows 11 Search index.
- Stroz Friedberg has released an open-source CLI tool named SIDR that can be used to analyze the Windows Search index at scale—especially when working with a large database or with a large number of systems.
Windows Search Indexer is a service that records information about files and data types in select directories and enables users to search for these files using the Start Menu and Windows Explorer. Like many features of Windows, Search Indexer was created to enhance the user experience. For Digital Forensics and Incident Response (“DFIR”) practitioners, this service generates valuable information that can be useful during an investigation.
This blog post will cover how data is stored in the Search index in Windows 10 and prior versions, and how it has changed in Windows 11. It will also cover how this data can be useful from a DFIR perspective, and how Stroz Friedberg’s tool, Search Index Database Reporter (“SIDR”), can help gather insights from the Search index for the purpose of DFIR investigations.
Structure of the Windows Search Index
In all versions of Windows except Windows Server, Search Indexer recursively indexes every file and folder within the following directories by default:
- C:\Users\* (the AppData directory is excluded from indexing)
- C:\ProgramData\Microsoft\Windows\Start Menu\Programs\*
Users can change the default configuration [Microsoft Dev Blogs] and choose which locations are indexed. Search Indexer also indexes URLs accessed using Internet Explorer and Edge, as well as user activity related to some programs, such as WordPad, Notepad, and Excel.
The following sections will describe the structure of the Search index in detail.
Windows 10 and Earlier (Windows.edb)
From Windows Vista through Windows 10, Windows stores the Search index in an Extensible Storage Engine (ESE) [Microsoft Documentation] database located at C:\ProgramData\Microsoft\Search\Data\Applications\Windows\Windows.edb. On Windows Server 2008 through Windows Server 2022, Stroz Friedberg observed that the database was structured the same way, but that Search Indexer was not enabled by default. The service is enabled by default on non-Server Windows versions.
Windows.edb contains several tables, three of which provide the most value to investigators:
The SystemIndex_Gthr table contains metadata for every file and folder indexed by Search Indexer. Stroz Friedberg identified the following columns as those typically most useful to an investigator:
| Column | Description |
|---|---|
| ScopeID | An integer that can be used to determine the record’s parent folder. This ID is also referenced in the |
| DocumentID | An integer assigned to every file and folder. Assignment of this ID occurs sequentially as files are created. This ID is also referenced in the |
| SDID | A Security Descriptor ID [Microsoft Documentation] that contains information about file ownership and access control. |
| LastModified | Last modified timestamp of the record, stored in Windows File Time format. |
| FileName | Name of the file or folder. |
The SystemIndex_GthrPth table contains the parent folders of the files indexed in the SystemIndex_Gthr table:
| Column | Description |
|---|---|
| Scope | An integer assigned to every folder. This can be correlated with ScopeID from SystemIndex_Gthr. |
| Name | The name of the folder. |
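The correlation between ScopeID and Scope can be sketched as a simple lookup. The rows below are made-up sample values rather than real index contents, and they assume the two tables have already been exported into Python dictionaries; only the column names come from the tables above.

```python
# Hypothetical rows shaped like the columns described above.
gthr_rows = [
    {"DocumentID": 101, "ScopeID": 7, "FileName": "report.docx"},
    {"DocumentID": 102, "ScopeID": 9, "FileName": "notes.txt"},
]
gthrpth_rows = [
    {"Scope": 7, "Name": "Documents"},
    {"Scope": 9, "Name": "Desktop"},
]


def attach_parent_folder(gthr, gthrpth):
    """Return (DocumentID, parent folder name, FileName) for each Gthr row."""
    folder_by_scope = {row["Scope"]: row["Name"] for row in gthrpth}
    return [
        (row["DocumentID"], folder_by_scope.get(row["ScopeID"]), row["FileName"])
        for row in gthr
    ]


resolved = attach_parent_folder(gthr_rows, gthrpth_rows)
for doc_id, folder, name in resolved:
    print(doc_id, folder, name)
```

A record whose ScopeID has no matching Scope simply gets `None` as its folder name, which keeps unmatched rows visible instead of dropping them.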
The SystemIndex_PropertyStore table contains additional attributes about the indexed files and folders, including the following columns of interest:
| Column | Description |
|---|---|
| WorkID | An integer assigned to the record. Maps to |
| System_Search_GatherTime | The time at which the record was indexed in the database, stored in Windows File Time format. |
| System_Size | The size of the file in bytes. |
| System_ModifiedTime | The $FN last modified time of the record, stored in Windows File Time format. |
| System_CreatedTime | The $FN creation time of the record, stored in Windows File Time format. |
| System_FileOwner | User who created the file, stored as username. |
| System_ItemPathDisplay | Full path of the record. |
| System_ItemType | File type of the record based on the extension of the file. If a file does not have an extension, the value will be “.”. |
| System_FileAttributes | Windows file attributes. [Microsoft Documentation] |
| System_Search_AutoSummary | Partial contents of the file. Stroz Friedberg was unable to determine a consistent rule for how many bytes were recorded in this property in Windows 10. See further sections for more information on AutoSummary. |
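Several of the columns above are stored in Windows File Time format, which counts 100-nanosecond intervals since January 1, 1601 (UTC). As a minimal sketch, assuming the raw value has already been decoded into an integer, the conversion to a readable UTC timestamp looks like this (the function name is illustrative):

```python
from datetime import datetime, timedelta, timezone

# Windows File Time counts 100-nanosecond ticks from this epoch.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a Windows File Time integer to a timezone-aware UTC datetime."""
    # Ten ticks make one microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


# 116444736000000000 ticks is the well-known offset of the Unix epoch.
print(filetime_to_datetime(116444736000000000))
```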
The screenshot below illustrates a sample from the Windows 10 SystemIndex_PropertyStore table when viewed with ESEDatabaseView [Nirsoft]. The highlighted record shows an example of a text file where partial content was indexed by the service.
Changes in Windows 11
In Windows 11, this data is stored in the same directory, but the single ESE database is replaced by SQLite database files called Windows.db and Windows-gather.db, discussed in further detail below. Because Windows-usn.db, a third database associated with the Search index on Windows 11, has less forensic value, it is not covered in this post.
The SystemIndex_Gthr and SystemIndex_GthrPth tables from the Windows 10 ESE database were placed in Windows-gather.db, and the content of SystemIndex_PropertyStore was placed into a table named SystemIndex_1_PropertyStore in Windows.db. The graphic below illustrates this change.
Despite having a similar name, the new SystemIndex_1_PropertyStore table in Windows 11 is structured differently from its Windows 10 counterpart. Rather than having multiple columns for each file property, the property values are stored as individual rows. The properties are stored as Property IDs, which are mapped to their names in the SystemIndex_1_PropertyStore_Metadata table.
The following graphic illustrates the relationship between SystemIndex_1_PropertyStore_Metadata and SystemIndex_1_PropertyStore in Windows.db.
According to this graphic, 13 is the Property ID for System_IsFolder. Therefore, we conclude that the record with WorkId set to 1 is a folder.
To perform meaningful analysis of these properties, investigators will need to design queries with table joins.
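One way such a join might look is sketched below against an in-memory stand-in for Windows.db. The table names come from this post, but the column names used here (WorkId, ColumnId, Value, Id, Name), the Property ID 40, and the sample paths are assumptions for illustration and should be checked against the schema of a real database before use.

```python
import sqlite3

# In-memory stand-in for Windows.db; column names are assumed, not confirmed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SystemIndex_1_PropertyStore_Metadata (Id INTEGER, Name TEXT)")
conn.execute("CREATE TABLE SystemIndex_1_PropertyStore (WorkId INTEGER, ColumnId INTEGER, Value)")
conn.executemany(
    "INSERT INTO SystemIndex_1_PropertyStore_Metadata VALUES (?, ?)",
    [(13, "System_IsFolder"), (40, "System_ItemPathDisplay")],  # 40 is hypothetical
)
conn.executemany(
    "INSERT INTO SystemIndex_1_PropertyStore VALUES (?, ?, ?)",
    [
        (1, 13, 1),
        (1, 40, r"C:\Users\example"),
        (2, 13, 0),
        (2, 40, r"C:\Users\example\notes.txt"),
    ],
)

# Resolve each Property ID to its name by joining against the metadata table.
QUERY = """
    SELECT p.WorkId, m.Name, p.Value
    FROM SystemIndex_1_PropertyStore AS p
    JOIN SystemIndex_1_PropertyStore_Metadata AS m ON m.Id = p.ColumnId
    ORDER BY p.WorkId, m.Name
"""

# Pivot the one-row-per-property layout back into one dictionary per record.
records = {}
for work_id, name, value in conn.execute(QUERY):
    records.setdefault(work_id, {})[name] = value

for work_id, props in records.items():
    print(work_id, props)
```

Pivoting in Python rather than in SQL keeps the query short and avoids hard-coding Property IDs.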
Forensic Value of the Windows Search Index
Stroz Friedberg’s testing has generated valuable insights into the information indexed in the SystemIndex_PropertyStore table in Windows 10 and the SystemIndex_1_PropertyStore table in Windows 11. The sections below outline the value these tables can provide to DFIR practitioners. Although the data is structured differently in Windows 10 and Windows 11, Stroz Friedberg observed no relevant differences in the content of the data that was indexed.
File and Folder Existence, Metadata, and Contents
Stroz Friedberg observed that file creation in any of the indexed directories triggers Windows to create a record for that file in the SystemIndex_PropertyStore and SystemIndex_1_PropertyStore tables. File deletion triggers a reindexing, which is reflected as a mark-for-deletion in the database. The discussion of recovering deleted records continues in the “Deleted Records” section below.
Records indexed in the SystemIndex_PropertyStore and SystemIndex_1_PropertyStore tables will contain metadata such as modified, accessed, and created timestamps, the full file path, and the file owner. This is an important source of information to review for threat actor activity, as C:\Users\* is a common location for threat actors to drop malware and stage files. Malicious startup tasks used by threat actors as a persistence mechanism can also be found in C:\ProgramData\Microsoft\Windows\Start Menu\Programs\StartUp\ and are therefore indexed in the SystemIndex_PropertyStore and SystemIndex_1_PropertyStore tables as well.
The time at which the file was indexed is stored in the System_Search_GatherTime property. If the indexed creation time varies from the indexed gather time of the target file, this suggests potential timestamp manipulation.
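The comparison described above can be sketched as a simple filter. The sample records, the function name, and the 60-second tolerance are all assumptions chosen for illustration; a flagged record is a lead to examine further, not proof of manipulation.

```python
TICKS_PER_SECOND = 10_000_000  # Windows File Time uses 100-nanosecond ticks

# Hypothetical records with both timestamps as Windows File Time integers.
sample_records = [
    {
        "System_ItemPathDisplay": r"C:\Users\example\a.txt",
        "System_CreatedTime": 133000000000000000,
        "System_Search_GatherTime": 133000000000000000 + 5 * TICKS_PER_SECOND,
    },
    {
        "System_ItemPathDisplay": r"C:\Users\example\b.exe",
        "System_CreatedTime": 133000000000000000,
        "System_Search_GatherTime": 133000000000000000 + 3 * 86400 * TICKS_PER_SECOND,
    },
]


def flag_time_gaps(records, tolerance_seconds=60):
    """Return paths whose creation and gather times differ by more than the tolerance."""
    tolerance_ticks = tolerance_seconds * TICKS_PER_SECOND
    return [
        rec["System_ItemPathDisplay"]
        for rec in records
        if abs(rec["System_Search_GatherTime"] - rec["System_CreatedTime"]) > tolerance_ticks
    ]


flagged = flag_time_gaps(sample_records)
print(flagged)
```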
AutoSummary – Obtain Partial File Contents
Possibly one of the most valuable features of the Windows Search Indexer service is AutoSummary. AutoSummary enables the user to search for a file using the contents of the file. For our purposes, AutoSummary allows the investigator to obtain limited contents of files with select file extensions. Stroz Friedberg’s testing has shown that, in most cases, the file extension must match the file type for AutoSummary to record the contents of the file. In other words, for AutoSummary to record the contents of the file, it must be able to correctly parse the text of the recorded file. Therefore, if the file extension is .pdf but the file is a plaintext file, then AutoSummary will parse it incorrectly.
AutoSummary can still parse plaintext files even in the case of a file extension mismatch if the file extension is also that of a plaintext file. For example, AutoSummary will be able to parse the contents of a .txt file renamed to have a .bat extension.
The file contents indexed by the Search Indexer service can be found in the System_Search_AutoSummary field in the SystemIndex_PropertyStore and SystemIndex_1_PropertyStore tables. Stroz Friedberg’s testing found that, in Windows 11, partial file contents (up to the first 1024 bytes) may be found within this property.
Stroz Friedberg has confirmed that, by default, Search Indexer will record contents from the following file types in the AutoSummary property:
- Active Server Page (.asp)
- Batch files (.bat)
- Command files (.cmd)
- Email files (.eml)
- Excel files (.xlsx)
- HTML files (.html)
- Configuration files (.ini)
- OneNote files (.one)
- PDF files (.pdf)
- Registry files (.reg)
- SQL files (.sql)
- Text files (.txt)
- Visual Basic files (.vbs)
- Word files (.docx)
- XML files (.xml)
- Zip files (.zip) – Windows 10 only
Deleted Records
In Windows 10, investigators can carve the slack space of Windows.edb to try to recover indexed records of deleted files. Tools like WinSearchDBAnalyzer [GitHub] leverage the Windows Search platform API to recover deleted records. Without these tools, carving and keyword searching may be difficult due to the database encoding [University of York] and frequent restructuring [Forensic Focus].
In Windows 11, any changes made to the Windows.db and Windows-gather.db databases will be first written to the write-ahead logs, Windows.db-wal and Windows-gather.db-wal, which are temporary records of recent changes also known as WAL [SQLite Documentation] files. When a user deletes a file, the file record will be available in the main database until the changes in the WAL are written to the database upon reboot or checkpoint [SQLite Documentation]. This applies to all changes made by the Search Indexer service, including file creation and file renaming. This gives investigators a window of opportunity where the original database may still contain records of items that have been deleted and highlights the importance of getting a forensic image of the current state of a system. Investigators should keep in mind, however, that data found in slack space may not always be intact and may exist without additional context.
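Because recent changes may exist only in the write-ahead logs, it helps to confirm that the WAL files were collected alongside the databases. The sketch below lists the Windows 11 index databases and any WAL files present in a directory. The function name is illustrative, and the demonstration runs against a temporary directory of empty placeholder files; in practice it would be pointed at an exported copy or a mounted image rather than a live system.

```python
import tempfile
from pathlib import Path


def related_index_files(data_dir):
    """List Windows 11 Search index databases and any WAL files next to them."""
    data_dir = Path(data_dir)
    found = []
    for db_name in ("Windows.db", "Windows-gather.db"):
        # Check the main database first, then its write-ahead log.
        for candidate in (db_name, db_name + "-wal"):
            path = data_dir / candidate
            if path.is_file():
                found.append(path)
    return found


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("Windows.db", "Windows.db-wal", "unrelated.txt"):
        (Path(tmp) / name).touch()
    names = [p.name for p in related_index_files(tmp)]

print(names)
```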
IE/Edge Browsing Activity
Stroz Friedberg found that the Search index also records all URLs that a user accesses via Internet Explorer and Edge, except those viewed using Private Browsing Mode. In Windows 10 and earlier, visited URLs are stored in the SystemIndex_PropertyStore table in one or more of the following three fields. Stroz Friedberg found that several factors affected how the URL was stored, including whether the URL was valid and whether the user was connected to the Internet at the time of access.
| Field | Condition |
|---|---|
| ItemPathDisplay | For valid URLs accessed with an active Internet connection. |
| Activity_ContentUri | For invalid URLs accessed with or without an active Internet connection. |
| Activity_Description | For invalid URLs accessed with or without an active Internet connection. |
In Windows 11, visited URLs are stored in the System_Link_TargetURL property of the SystemIndex_1_PropertyStore table. In both Windows 10 and Windows 11, some invalid URLs may not be indexed.
If the user deletes their browser history, the visited URLs may still be present in the Windows.db file in Windows 11 and can potentially be carved from the Windows.edb file in Windows 10.
User Activity Logging
Another useful feature built into the Search Indexer service is called ActivityHistory. This feature tracks file opening on a per-user basis, which can provide investigators with evidence of file knowledge and user account attribution for suspicious activity such as a user opening files using select programs.
ActivityHistory records will contain the string “ActivityHistoryItem” in the
When a user opens any file in an indexed location, several attributes are recorded in the ActivityHistory for the session, including:
| Attribute | Description |
|---|---|
| System_ItemPathDisplay | Full path of the opened file, which will include the SID of the user who interacted with the file. |
| ActivityHistory_StartTime and ActivityHistory_EndTime | Start and end times of the activity, which correspond to time of file open/close. |
| ActivityHistory_AppId | Program used to open the file. |
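Since the path value includes the user's SID, attribution can start with a pattern match. The sketch below pulls the first SID-shaped token out of a string. The sample path is invented to exercise the pattern and does not reflect the exact layout of a real ActivityHistory record; the function name is also illustrative.

```python
import re

# A SID in string form: "S-1-" followed by dash-separated numeric components.
SID_PATTERN = re.compile(r"S-1-\d+(?:-\d+)+")


def extract_sid(path_value):
    """Return the first SID found in the string, or None if there is none."""
    match = SID_PATTERN.search(path_value)
    return match.group(0) if match else None


# Hypothetical value; real records may place the SID elsewhere in the string.
sample = r"C:\Users\S-1-5-21-1111111111-2222222222-3333333333-1001\Documents\plan.txt"
sid = extract_sid(sample)
print(sid)
```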
Stroz Friedberg’s testing has shown that, when a file is deleted or renamed, Search Indexer does not delete any ActivityHistory records associated with that file. Therefore, investigators can use ActivityHistory to attribute file knowledge even when a user deletes or renames a file. Stroz Friedberg is still investigating the size limit of the database and the retention period for ActivityHistory of renamed or deleted files.
Stroz Friedberg observed tracking of the following activity by the Search Indexer service. Further testing may reveal additional tracked activity:
- Opening text files (.txt, .bat, .xml, and .js) with the following programs:
  - Internet Explorer (except .bat files)
  - Edge browser (except .bat files)
- Opening PDF files with the following programs:
  - Edge browser
  - Firefox browser
  - Chrome browser
  - Adobe Acrobat Reader
- Opening Excel files (.xls and .xlsx) with Microsoft Excel
- Opening Document files (.doc and .docx) with Microsoft Word
- Opening PowerPoint files (.ppt and .pptx) with Microsoft PowerPoint
- Opening image files (.bmp, .png, and .jpg) with Microsoft Photos
Stroz Friedberg observed that activity from PowerShell and 7-Zip was not tracked. Stroz Friedberg also observed that Search Indexer will record file-specific metadata for other file types, such as subject and recipient in email message files (.eml and .msg).
Reviewing the Content of the Search Index Database
Stroz Friedberg reviewed the Search index on Windows 10 22H2 and Windows 11 22H2 after performing actions on the system such as creating, renaming, and deleting files. The following tools were used to view the Search index:
These tools provide a graphical user interface for the index that can be useful for investigators when looking at a handful of systems, but they do not scale well for investigations that involve large enterprise networks with hundreds or thousands of systems. Additionally, the index can grow very large depending on the indexing configuration, in which case the tools above may struggle to process the entire index. Stroz Friedberg recognizes these challenges and has created a CLI tool called SIDR that can process the Search index in both Windows 10 and Windows 11 at scale. The tool provides the following reports, which summarize the most valuable information available in the Search index.
Windows Search Index File Report
The File Report will contain a list of all the files present in the indexed locations along with metadata such as the full file path, MAC timestamps in UTC, and the file owner.
Windows Search Index Internet Explorer and Edge History Report
The Internet Explorer and Edge History Report will contain URL browsing activity.
Windows Search Index ActivityHistory Report
The ActivityHistory report will contain user account-attributed file access activity.
The Windows Search index is a lesser-known artifact that, if analyzed properly, can supplement analysis for investigations that involve fragmented evidence and defense evasion tactics. By providing user attribution and records of files created on the system, the index serves as an important source of evidence for not only threat actor activity in incident response investigations, but also suspicious user activity for insider threat and intellectual property cases. Stroz Friedberg is proud to release this research and scalable CLI tool to help the DFIR community respond faster and more accurately in future investigations.
Authors: Phalgun Kulkarni, Julia Paluch
Special Thanks: Partha Alwar
April 26, 2023
©Aon plc 2023