Record Builder
Writing a RecordBuilder
RecordBuilders encapsulate the smallest units of aggregation logic required to generate records for a plugin.
They define two methods: aggregate(), which builds the actual DataTable and numeric records to insert into archive tables,
and getRecordMetadata(), which returns information about what records the RecordBuilder builds.
aggregate() will generally aggregate data from log tables to create records, but it does not have to. An example of a use case
without aggregation would be importing analytics data from another service.
getRecordMetadata() is used when aggregating records for non-day periods. In this case, Matomo will find the record values
for the subperiods of the non-day period and aggregate them together.
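As a rough illustration of what "aggregating subperiods together" means, the following sketch sums label-keyed metrics from two day tables into one week table. It uses plain PHP arrays rather than Matomo's DataTable, and sumSubperiodTables() is a hypothetical helper written for this example, not a Matomo API.

```php
<?php
declare(strict_types=1);

/**
 * Sum numeric metrics across subperiod tables, keyed by row label.
 * Hypothetical helper for illustration only; not part of Matomo.
 */
function sumSubperiodTables(array $subperiodTables): array
{
    $result = [];
    foreach ($subperiodTables as $table) {
        foreach ($table as $label => $metrics) {
            foreach ($metrics as $name => $value) {
                // Missing rows or columns start at 0, then accumulate.
                $result[$label][$name] = ($result[$label][$name] ?? 0) + $value;
            }
        }
    }
    return $result;
}

// Two made-up "day" records for the same report.
$monday  = ['Firefox' => ['nb_visits' => 10], 'Chrome' => ['nb_visits' => 7]];
$tuesday = ['Firefox' => ['nb_visits' => 4], 'Safari' => ['nb_visits' => 2]];

// The "week" record is built from the day records, not from log tables.
$week = sumSubperiodTables([$monday, $tuesday]);
```

Summation is only the default behavior; Step four describes how to choose a different operation per column.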
If your plugin needs to insert data into the archive tables during archiving, then you'll want to create your own RecordBuilder classes.
This guide describes how to do that.
How to create one
Step one: identify the list of records and log aggregation queries you want to bundle together
Log aggregation queries are expensive (especially with segmentation) and Matomo wants to be able to run as few of them
as possible at a time. A RecordBuilder is meant to encapsulate the smallest amount of archiving logic possible, allowing Matomo
to run just what it needs to.
Many times this will either be running a single log aggregation query to generate a single DataTable or running a single
log aggregation query to generate multiple numeric metrics. Sometimes it will mean running multiple log aggregation queries
to generate a single DataTable or running multiple log aggregation queries to generate multiple DataTables and multiple metrics.
It is up to you as a developer to find the balance between efficiency (executing the fewest log aggregation queries overall)
and modularity (having RecordBuilders that individually do as little as possible).
Once you've identified the RecordBuilders you'll need, create empty classes for them in a RecordBuilders subfolder of your plugin. For example,
/path/to/matomo/plugins/MyPlugin/RecordBuilders/MyRecordBuilder.
A note about Parameterized RecordBuilders
RecordBuilders that can be created without specifying constructor arguments (that is, are default-constructible)
are found and created automatically by Matomo. But it is also possible to create RecordBuilders that require
parameters. These RecordBuilders are added via the Archiver.addRecordBuilders event.
The ability to create parameterized RecordBuilders may not be necessary in most cases, but if your plugin
manages entities and provides reports about those entities, it can be used to avoid having to run a query for
every entity in the database within a single RecordBuilder.
Examples of plugins that use this feature are the Custom Reports premium feature and the A/B Testing premium feature.
Each of these plugins uses a RecordBuilder that takes an ID. For Custom Reports this is the ID of the specific custom
report and for A/B Testing this is the ID of the experiment.
Step two: implement getRecordMetadata()
Once you know what queries the RecordBuilders you are going to create will execute, you can start coding.
The first thing to do is implement the getRecordMetadata() method.
All this method does is return a list of Record entries describing the records the builder will create:
use Piwik\ArchiveProcessor;
use Piwik\ArchiveProcessor\Record;

public function getRecordMetadata(ArchiveProcessor $archiveProcessor): array
{
    return [
        Record::make(Record::TYPE_BLOB, 'MyPlugin_myRecord'),
        Record::make(Record::TYPE_NUMERIC, 'MyPlugin_myMetric'),
        // ...
    ];
}
The above is a typical example of how this method would be implemented, but it doesn't have to just be a hard-coded array.
You can use the ArchiveProcessor to get the current site/period/segment or fetch system settings or measurable
settings and vary the result based on that information. The only requirement is that every Record returned matches
what can be returned by the aggregate() method, which we'll look at next.
Step three: implement aggregate()
The next step is to implement your actual log aggregation logic in the aggregate() method. This method accepts
an ArchiveProcessor and returns an array mapping record names to the record values to insert. Record values are
either numeric metric values or DataTable instances, which get serialized and inserted as blobs.
How those values are created is up to you: there is no single prescribed way to do log aggregation.
The current pattern in Matomo is to use the core LogAggregator class to query log data and loop through the result.
If your plugin provides its own additional log tables, then the pattern is to define your own Aggregator classes
to build and execute log aggregation SQL queries, and use those classes in your RecordBuilders.
An example of this might look like:
use Piwik\ArchiveProcessor;
use Piwik\DataTable;
use Piwik\Metrics;

public function aggregate(ArchiveProcessor $archiveProcessor): array
{
    $logAggregator = $archiveProcessor->getLogAggregator();

    $report = new DataTable();

    // one log aggregation query, grouped by browser name
    $query = $logAggregator->queryVisitsByDimension(['label' => 'config_browser_name']);
    while ($row = $query->fetch()) {
        $columns = [
            Metrics::INDEX_NB_UNIQ_VISITORS => $row[Metrics::INDEX_NB_UNIQ_VISITORS],
            Metrics::INDEX_NB_VISITS => $row[Metrics::INDEX_NB_VISITS],
            Metrics::INDEX_NB_ACTIONS => $row[Metrics::INDEX_NB_ACTIONS],
            Metrics::INDEX_NB_USERS => $row[Metrics::INDEX_NB_USERS],
            Metrics::INDEX_MAX_ACTIONS => $row[Metrics::INDEX_MAX_ACTIONS],
            Metrics::INDEX_SUM_VISIT_LENGTH => $row[Metrics::INDEX_SUM_VISIT_LENGTH],
            Metrics::INDEX_BOUNCE_COUNT => $row[Metrics::INDEX_BOUNCE_COUNT],
            Metrics::INDEX_NB_VISITS_CONVERTED => $row[Metrics::INDEX_NB_VISITS_CONVERTED],
        ];

        // add the metrics to the row with this label, creating the row if needed
        $report->sumRowWithLabel($row['label'] ?? '', $columns);
    }

    // keys must match the record names returned by getRecordMetadata()
    return [
        'MyPlugin_myRecord' => $report,
        'MyPlugin_myMetric' => $report->getRowsCount(),
    ];
}
This example queries the log_visit table, grouping by the config_browser_name column and aggregating visit metrics.
Then, for each row of that query, it adds the metrics to a DataTable which is eventually returned.
Most aggregate() methods will be more complicated than this, but hopefully it provides you with a general understanding
of how they should work. We recommend looking at existing RecordBuilders in Matomo as well to see what is possible.
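To make the loop-and-accumulate pattern concrete outside of Matomo, here is a minimal sketch that mimics it with plain PHP arrays. The addRowToReport() helper and the sample rows are assumptions made for this example. Real code would use DataTable and a LogAggregator query as described above.

```php
<?php
declare(strict_types=1);

/**
 * Add metric columns to the report row with the given label,
 * creating the row if it does not exist yet.
 * Stand-in for illustration only; real code would use Matomo's DataTable.
 */
function addRowToReport(array &$report, string $label, array $columns): void
{
    foreach ($columns as $column => $value) {
        $report[$label][$column] = ($report[$label][$column] ?? 0) + $value;
    }
}

// Hypothetical rows, shaped like the result of a grouped log query.
$rows = [
    ['label' => 'Firefox', 'nb_visits' => 3, 'nb_actions' => 9],
    ['label' => 'Chrome',  'nb_visits' => 5, 'nb_actions' => 11],
    ['label' => 'Firefox', 'nb_visits' => 2, 'nb_actions' => 4],
];

$report = [];
foreach ($rows as $row) {
    $label = $row['label'] ?? '';
    unset($row['label']); // everything that remains is a metric column
    addRowToReport($report, $label, $row);
}

// One blob-like record and one numeric record, keyed by record name.
$records = [
    'MyPlugin_myRecord' => $report,
    'MyPlugin_myMetric' => count($report),
];
```

Rows sharing a label are merged into one report row, which is why the numeric record counts two rows even though the query returned three.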
Step four: decide whether you need to set custom row limits or aggregation operations
At this point, the hard parts are over. The last two steps are just finishing touches.
By default, Matomo does not limit the number of rows inserted into archive tables. For reports with a bounded number
of rows, like VisitorInterest.getVisitsByVisitCount and UserCountry.getCountry, this is acceptable. But for reports
whose number of rows can grow without bound, it's good practice to make sure the number of rows is capped.
To set a limit, set the maxRowsInTable and maxRowsInSubtable properties in the constructor of your RecordBuilder.
This can be hard-coded or it can come from configuration:
use Piwik\ArchiveProcessor\RecordBuilder;
use Piwik\Config;
use Piwik\Metrics;

class MyRecordBuilder extends RecordBuilder
{
    public function __construct()
    {
        parent::__construct();

        $this->maxRowsInTable = (int)Config::getInstance()->MyPlugin['datatable_archiving_maximum_rows'];
        $this->maxRowsInSubtable = (int)Config::getInstance()->MyPlugin['datatable_archiving_maximum_rows_subtable'];

        // we want to sort by the most important metric in our reports before we cut off rows
        $this->columnToSortByBeforeTruncation = Metrics::INDEX_NB_VISITS;
    }
}
If you don't know what to use, you can set both values to Config::getInstance()->General['datatable_archiving_maximum_rows_standard'].
Also note that we set columnToSortByBeforeTruncation to make sure the rows with the fewest visits are the ones that get removed.
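The effect of sorting before truncation can be sketched in plain PHP. The truncateReport() helper below is hypothetical, and folding the removed rows into a single 'Others' row is an assumption of this sketch, not a statement of how Matomo implements truncation.

```php
<?php
declare(strict_types=1);

/**
 * Keep the top rows by $sortColumn and fold the rest into one summary row.
 * Hypothetical helper; the 'Others' label and the folding are assumptions.
 */
function truncateReport(array $report, int $maxRows, string $sortColumn): array
{
    if ($maxRows < 1 || count($report) <= $maxRows) {
        return $report; // nothing to cut off
    }

    // Sort descending by the chosen column, keeping labels as keys.
    uasort(
        $report,
        fn(array $a, array $b): int => ($b[$sortColumn] ?? 0) <=> ($a[$sortColumn] ?? 0)
    );

    // Reserve the last slot for the summary row.
    $kept = array_slice($report, 0, $maxRows - 1, true);
    $removed = array_slice($report, $maxRows - 1, null, true);

    $others = [];
    foreach ($removed as $row) {
        foreach ($row as $column => $value) {
            $others[$column] = ($others[$column] ?? 0) + $value;
        }
    }
    $kept['Others'] = $others;

    return $kept;
}

$report = [
    'A' => ['nb_visits' => 50],
    'B' => ['nb_visits' => 5],
    'C' => ['nb_visits' => 20],
    'D' => ['nb_visits' => 1],
];

$truncated = truncateReport($report, 3, 'nb_visits');
```

Without the sort, the rows cut off would simply be whichever came last in insertion order (here C and D), which would drop a high-traffic row while keeping a low-traffic one.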
Additionally, if your plugin provides metrics that should be aggregated together with an operation other than summation,
you will need to set the $columnAggregationOps property:
class MyRecordBuilder extends RecordBuilder
{
    public function __construct()
    {
        parent::__construct();

        // ...

        $this->columnAggregationOps = [
            'my_max_metric' => 'max',
            'my_min_metric' => 'min',
            'my_other_metric' => function ($thisValue, $otherValue, $thisRow, $otherRow) {
                // custom aggregation logic here
            },
        ];
    }
}
Note that each of these settings can also be overridden for specific records by setting the relevant property
on Record instances in your getRecordMetadata() method.
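To show how per-column aggregation operations behave when two rows are combined, here is a minimal sketch in plain PHP. mergeRows() is a hypothetical helper, and the custom operation (keep the value from the row with more visits) is made up for the example. Only the 'max'/'min' names and the four-argument callback shape come from the snippet above.

```php
<?php
declare(strict_types=1);

/**
 * Merge $otherRow into $thisRow, applying a per-column operation.
 * Columns without an entry in $ops are summed.
 * Hypothetical helper for illustration; not Matomo's implementation.
 */
function mergeRows(array $thisRow, array $otherRow, array $ops): array
{
    $merged = $thisRow;
    foreach ($otherRow as $column => $otherValue) {
        if (!array_key_exists($column, $thisRow)) {
            $merged[$column] = $otherValue;
            continue;
        }
        $thisValue = $thisRow[$column];
        $op = $ops[$column] ?? 'sum';

        // Check for closures first: 'max' and 'min' are also callable strings.
        if ($op instanceof \Closure) {
            $merged[$column] = $op($thisValue, $otherValue, $thisRow, $otherRow);
        } elseif ($op === 'max') {
            $merged[$column] = max($thisValue, $otherValue);
        } elseif ($op === 'min') {
            $merged[$column] = min($thisValue, $otherValue);
        } elseif ($op === 'sum') {
            $merged[$column] = $thisValue + $otherValue;
        } else {
            throw new \InvalidArgumentException("Unknown aggregation operation for column '$column'");
        }
    }
    return $merged;
}

$ops = [
    'my_max_metric' => 'max',
    'my_min_metric' => 'min',
    // Made-up rule: keep the value from whichever row has more visits.
    'my_other_metric' => function ($thisValue, $otherValue, $thisRow, $otherRow) {
        return $otherRow['nb_visits'] > $thisRow['nb_visits'] ? $otherValue : $thisValue;
    },
];

$day1 = ['nb_visits' => 10, 'my_max_metric' => 7, 'my_min_metric' => 3, 'my_other_metric' => 100];
$day2 = ['nb_visits' => 4,  'my_max_metric' => 9, 'my_min_metric' => 5, 'my_other_metric' => 200];

$merged = mergeRows($day1, $day2, $ops);
```

Here nb_visits has no entry in $ops, so it falls back to summation, while the other three columns each follow their configured operation.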
Step five: if your RecordBuilder is parameterized, implement the relevant event
If your RecordBuilder is not parameterized then there's nothing else to do. You're done and Matomo will detect and use it.
If it is parameterized, then there's still one thing left to do. Matomo will not be able to automatically create a RecordBuilder
that takes parameters, so it must be added manually in the Archiver.addRecordBuilders event like so:
use Piwik\Container\StaticContainer;

class MyPlugin extends \Piwik\Plugin
{
    public function registerEvents()
    {
        $hooks = [
            'Archiver.addRecordBuilders' => 'addRecordBuilders',
        ];
        return $hooks;
    }

    public function addRecordBuilders(array &$recordBuilders): void
    {
        $idSite = \Piwik\Request::fromRequest()->getIntegerParameter('idSite', 0);
        if (!$idSite) {
            return;
        }

        // MyEntityDao stands for whatever class loads your plugin's entities
        $entities = StaticContainer::get(MyEntityDao::class)->getAllEntitiesForSite($idSite);
        foreach ($entities as $entity) {
            $recordBuilders[] = new MyRecordBuilder($entity);
        }
    }
}
Here we create a RecordBuilder instance for every entity our plugin manages.
And that's it, your RecordBuilder is done.
Advanced
Overriding non-day period aggregation
Archiving for non-day periods is handled by the buildForNonDayPeriod() method, which
will use record metadata to query and aggregate records for the requested period's subperiods.
Normally, when creating a RecordBuilder, you will not need to interact with it. But, in
some rare cases, the default behavior of aggregating subperiods will not be enough.
In this case, it is perfectly acceptable to override the buildForNonDayPeriod() method
and implement your own logic.
If doing so, keep the following in mind:
- When querying for records of subperiods, do not fetch all of them into memory at once. Record data can take up a significant amount of memory, and querying all the data at once can cause out-of-memory errors in the archiving process. Instead, use a method like Archive::querySingleBlob(), which uses a cursor.
- Insert blob records via the RecordBuilder::insertBlobRecord() method. For numeric records, use ArchiveProcessor::insertNumericRecords().
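The memory advice above can be illustrated with a PHP generator that hands over one subperiod table at a time. This is a sketch of the general cursor-style pattern only: iterateSubperiodBlobs() and the in-memory $fakeStorage are hypothetical stand-ins, and the sketch does not reproduce the signature or return type of Archive::querySingleBlob().

```php
<?php
declare(strict_types=1);

/**
 * Yield one subperiod table at a time instead of returning them all.
 * Hypothetical stand-in for a cursor-backed query; not a Matomo API.
 */
function iterateSubperiodBlobs(int $subperiodCount, callable $loadBlob): \Generator
{
    for ($i = 0; $i < $subperiodCount; $i++) {
        // Each table is loaded only when the consumer asks for it.
        yield $loadBlob($i);
    }
}

// Made-up storage standing in for archived day records.
$fakeStorage = [
    ['Firefox' => 10, 'Chrome' => 7],
    ['Firefox' => 4],
    ['Chrome' => 1, 'Safari' => 2],
];

$total = [];
$loader = fn(int $i): array => $fakeStorage[$i];
foreach (iterateSubperiodBlobs(count($fakeStorage), $loader) as $table) {
    foreach ($table as $label => $visits) {
        $total[$label] = ($total[$label] ?? 0) + $visits;
    }
    // $table is replaced on the next iteration, so the loop itself
    // holds only one subperiod table plus the running total.
}
```

The running total is the only structure that grows with the data, which is the property to preserve when writing a custom buildForNonDayPeriod().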