Populate search index
AtoM maintains an Elasticsearch search index to provide fast, full-text search results with faceting.
Occasionally it is necessary to repopulate the Elasticsearch index from the primary database, especially after operations that affect many records, e.g.
- Importing data (CSV or EAD import)
- Moving a large series to a new parent record
- Moving a Fonds or Collection to a different archival institution
- Doing any bulk search and replace operation
Populating the search index requires running a Symfony command line task, which is located in the root directory of the application. By default, running this task will delete the current index, then repopulate and optimize the index - though an option exists that will allow you to update the index in place, without first deleting the existing index. Depending on the number of records in your installation, this task can take a while to run - for production sites, we recommend running this task after regular business hours.
To run the task without any of the available options, enter the following in the command-line interface, in the root AtoM directory:
php symfony search:populate
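Because the task has to be launched from the root AtoM directory, a small wrapper can guard against running it from the wrong place. The following is a minimal sketch only: the `run_populate` function name and the `ATOM_ROOT` variable are hypothetical, and the default path is an assumption about where AtoM is installed on your server.

```shell
#!/bin/sh
# Hypothetical wrapper: run search:populate from the AtoM root directory.
# ATOM_ROOT is an assumed install path; adjust it for your server.
ATOM_ROOT="${ATOM_ROOT:-/usr/share/nginx/atom}"

run_populate() {
    # The task is launched through the `symfony` file in the AtoM root,
    # so refuse to continue if that file is missing.
    if [ ! -f "$ATOM_ROOT/symfony" ]; then
        echo "symfony not found in $ATOM_ROOT" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged.
    # Any arguments are passed through to the task as-is.
    (cd "$ATOM_ROOT" && php symfony search:populate "$@")
}
```

Calling `run_populate` with no arguments would then run the default delete-and-repopulate behaviour, while `run_populate --update` would pass that option through.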
You can also run php symfony help search:populate to see the available options for this command-line task.
The --application and --env options should not be used - AtoM requires the use of the pre-set defaults for Symfony to be able to execute the task.
The --slug option can be used to specify a specific resource to be re-indexed. If the target resource is hierarchical (i.e. an archival description with lower-level children), the descendant records will also be indexed. If you don’t want lower-level records to be indexed when using this option, you can also use the --ignore-descendants option - in this case, any lower-level records below the target resource will not be re-indexed as part of the process.
Note
- When used, the --slug option will ignore any parameters set by the --exclude-types option
- The --ignore-descendants option will only work when used in conjunction with the --slug option
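As an illustration of how --slug and --ignore-descendants combine, the sketch below prints (rather than runs) the command for a single resource. The `slug_reindex_cmd` helper is hypothetical, `example-fonds` is a placeholder slug, and the `--slug="..."` quoting is assumed to follow the same form as the --exclude-types examples in this section.

```shell
#!/bin/sh
# Print a search:populate command that re-indexes a single resource.
# "example-fonds" is a placeholder slug, not a real record.
slug_reindex_cmd() {
    slug="$1"
    descendants="${2:-with-descendants}"
    if [ -z "$slug" ]; then
        echo "usage: slug_reindex_cmd SLUG [no-descendants]" >&2
        return 1
    fi
    cmd="php symfony search:populate --slug=\"$slug\""
    # --ignore-descendants only works in conjunction with --slug,
    # so it is only ever appended to a command that already has one.
    if [ "$descendants" = "no-descendants" ]; then
        cmd="$cmd --ignore-descendants"
    fi
    echo "$cmd"
}

# Re-index the resource and its lower-level records:
slug_reindex_cmd example-fonds
# Re-index only the resource itself:
slug_reindex_cmd example-fonds no-descendants
```

Printing the command first makes it easy to review before pasting it into a terminal in the root AtoM directory.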
The --exclude-types option can be used if you do not want to re-index certain record types. When this option is used, the existing index is not completely flushed - the current part of the index for the excluded types will be maintained, while other entities will be flushed and re-indexed. This can be useful if, for example, you have many archival descriptions (which would take a long time to re-index), but only need to re-index your authority records and/or other entities - you could then use the command with --exclude-types="informationobject" and the existing index for descriptions would be maintained.
Below is a list of the types that can be excluded using this option:
- accession
- actor (i.e. authority record)
- aip (indexed during a DIP upload from Archivematica)
- function
- informationobject (i.e. archival description)
- repository (i.e. archival institution)
- term (such as subjects and places, etc)
You can exclude multiple types at once, by separating them with commas - don’t leave spaces between the commas. For example, to re-index your site but skip indexing terms (such as subject and place access points), authority records, and archival descriptions, you could enter the command like so:
php symfony search:populate --exclude-types="term,actor,informationobject"
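Since a stray space or a misspelled type name is easy to introduce when typing the list by hand, a command like the one above can also be assembled by a small helper. This is a minimal sketch: the `exclude_types_cmd` function is hypothetical, it only prints the command, and it accepts just the type names listed in this section.

```shell
#!/bin/sh
# Build a search:populate command with a comma-separated
# --exclude-types value that contains no spaces.
exclude_types_cmd() {
    if [ "$#" -eq 0 ]; then
        echo "usage: exclude_types_cmd TYPE [TYPE...]" >&2
        return 1
    fi
    list=""
    for t in "$@"; do
        # Accept only the type names documented in this section.
        case "$t" in
            accession|actor|aip|function|informationobject|repository|term) ;;
            *) echo "unknown type: $t" >&2; return 1 ;;
        esac
        # Append with a comma only when the list is already non-empty.
        list="${list:+$list,}$t"
    done
    echo "php symfony search:populate --exclude-types=\"$list\""
}

exclude_types_cmd term actor informationobject
```

Rejecting unknown names up front avoids starting a long re-index with a value the task may not interpret as intended.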
The command will indicate at the beginning which types are being re-indexed, and then will output the results of indexing the remaining entities.
You can also use the --show-types option to display the available index types prior to indexing. When run, the task will display a list of available types in your AtoM instance.
Pressing enter will continue and run the search:populate task (with no types excluded), or alternatively, you can exit the task by entering CTRL+C and then re-enter your parameters, using --exclude-types as needed.
Finally, the --update option can be used to update the index in place, without first deleting the existing index. The process may take slightly longer, but it can be useful for indexing on production sites, as there is no downtime for end users - without this option, no records will display in the search/browse results until indexing has completed.
Example use:
php symfony search:populate --update
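The choice between an in-place update and a full rebuild can be captured in a small helper. This is a sketch only: the `populate_cmd` function and its `live` / `maintenance` labels are hypothetical names for this example, not AtoM settings, and the function prints the command instead of running it.

```shell
#!/bin/sh
# Pick between an in-place update and a full rebuild of the index.
# "live" and "maintenance" are hypothetical labels used only in this sketch.
populate_cmd() {
    mode="${1:-maintenance}"
    case "$mode" in
        live)
            # Updates the index in place, so search/browse results
            # remain available to end users while indexing runs.
            echo "php symfony search:populate --update"
            ;;
        maintenance)
            # Default behaviour: delete the index, then repopulate and optimize it.
            echo "php symfony search:populate"
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}

populate_cmd live
```

A full rebuild remains the default here because it matches what the task does when run without options; the in-place update is selected explicitly.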
See also