Class CreateIndexJson
- All Implemented Interfaces:
java.lang.Runnable, java.util.concurrent.Callable<java.lang.Object>
public class CreateIndexJson extends CollecTorMain
Create an index.json file containing metadata of all files in the indexed/ directory and update the htdocs/ directory to contain all files to be served via the web server.
File metadata includes:
- Path for downloading this file from the web server.
- Size of the file in bytes.
- Timestamp when the file was last modified.
- Descriptor types as found in @type annotations of contained descriptors.
- Earliest and latest publication timestamp of contained descriptors.
- SHA-256 digest of the file.
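Several of the metadata fields listed above can be read with the standard library alone. The following is a minimal sketch, not the class's actual implementation: the class name FileMetadataSketch, its method names, the output layout, and the choice of lowercase hex for the digest are all assumptions made for illustration. The encoding used in index.json is not stated here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Instant;

// Hypothetical helper, not part of CollecTor.
class FileMetadataSketch {

  /** Returns the SHA-256 digest of the given bytes as lowercase hex (assumed encoding). */
  static String sha256Hex(byte[] data) {
    return toHex(newSha256().digest(data));
  }

  /** Streams a file through SHA-256 so that large files need not fit in memory. */
  static String sha256Hex(Path file) {
    MessageDigest md = newSha256();
    try (InputStream in = Files.newInputStream(file)) {
      byte[] buffer = new byte[8192];
      int read;
      while ((read = in.read(buffer)) != -1) {
        md.update(buffer, 0, read);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return toHex(md.digest());
  }

  /** Summarizes size, last-modified time, and digest of a file in one line. */
  static String describe(Path file) {
    try {
      long size = Files.size(file);
      Instant lastModified = Files.getLastModifiedTime(file).toInstant();
      return file.getFileName() + " size=" + size
          + " last_modified=" + lastModified
          + " sha256=" + sha256Hex(file);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  private static MessageDigest newSha256() {
    try {
      return MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 support is required of every Java platform implementation.
      throw new IllegalStateException(e);
    }
  }

  private static String toHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder(bytes.length * 2);
    for (byte b : bytes) {
      sb.append(String.format("%02x", b & 0xff));
    }
    return sb.toString();
  }
}
```

Descriptor types and publication timestamps are left out of this sketch because they require parsing the descriptors themselves.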
This class maintains its own working directory htdocs/ with subdirectories like htdocs/archive/ or htdocs/recent/ and another subdirectory htdocs/index/. The first two subdirectories contain (hard) links created and deleted by this class; the third subdirectory contains the index.json file in uncompressed and compressed forms.
The main reason for having the htdocs/ directory is that indexing a large descriptor file can be time consuming. New or updated files in indexed/ first need to be indexed before their metadata can be included in index.json. Another reason is that files removed from indexed/ shall still be available for download for a limited period of time after disappearing from index.json.
The reason for creating (hard) links in htdocs/, rather than copies, is that links do not consume additional disk space. Because hard links cannot cross file system boundaries, all directories must be located on the same file system. Storing symbolic links in htdocs/ would not have worked with replaced or deleted files in the original directories. Symbolic links in original directories are allowed as long as their targets are on the same file system.
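The behavior described above, where a hard link keeps a file available after the original directory entry is deleted, can be tried out with java.nio.file.Files.createLink. This is only a sketch under stated assumptions: the class HardLinkSketch, its method names, and the temporary directory layout are hypothetical, and the same-file-system check relies on the platform's FileStore equality.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of CollecTor.
class HardLinkSketch {

  /** Checks whether two existing paths live on the same file store. */
  static boolean sameFileStore(Path a, Path b) throws IOException {
    return Files.getFileStore(a).equals(Files.getFileStore(b));
  }

  /** Creates a hard link at linkPath that refers to the existing file. */
  static void createHardLink(Path existing, Path linkPath) throws IOException {
    Files.createDirectories(linkPath.getParent());
    Files.createLink(linkPath, existing);
  }

  /** Returns true if the linked content is still readable after the original is deleted. */
  static boolean linkSurvivesDeletion() {
    try {
      Path base = Files.createTempDirectory("hardlink-sketch");
      Path original = base.resolve("indexed").resolve("descriptors.txt");
      Files.createDirectories(original.getParent());
      byte[] content = "@type example 1.0\n".getBytes(StandardCharsets.UTF_8);
      Files.write(original, content);

      Path linkPath = base.resolve("htdocs").resolve("recent").resolve("descriptors.txt");
      createHardLink(original, linkPath);

      // Removing the original directory entry leaves the link's data intact.
      Files.delete(original);
      return Arrays.equals(content, Files.readAllBytes(linkPath));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A symbolic link in the same position would dangle after the delete, which is the failure mode the paragraph above refers to.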
This class does not write, modify, or delete any files in the indexed/ directory. At the same time it does not expect any other classes to write, modify, or delete contents in the htdocs/ directory.
-
Field Summary
Fields inherited from class org.torproject.metrics.collector.cron.CollecTorMain
config, mapPathDescriptors, SOURCES
-
Constructor Summary
Constructors
- Constructor: CreateIndexJson(Configuration configuration)
  Description: Initialize this class with the given configuration.
-
Method Summary
- Modifier and Type: protected org.torproject.metrics.collector.indexer.IndexerTask
  Method: createIndexerTask(java.nio.file.Path fileToIndex)
  Description: Create an indexer task for indexing the given file.
- Modifier and Type: java.lang.String
  Method: module()
  Description: Returns the module name for logging purposes.
- Modifier and Type: protected java.lang.String
  Method: obtainBuildRevision()
  Description: Obtain and return the build revision string that was generated during the build process with git rev-parse --short HEAD and written to collector.buildrevision.properties, or return null if the build revision string cannot be obtained.
- Modifier and Type: void
  Method: startProcessing()
  Description: Run the indexer by (1) adding new files from indexed/ to the index, (2) adding old files from htdocs/ for which only links exist to the index, (3) scheduling new tasks and updating links in htdocs/ to reflect what's contained in the in-memory index, and (4) writing new uncompressed and compressed index.json files to disk.
- Modifier and Type: protected void
  Method: startProcessing(java.time.Instant now)
  Description: Helper method to startProcessing() that accepts the current execution time and which is used by tests.
- Modifier and Type: protected java.lang.String
  Method: syncMarker()
  Description: Returns property prefix/infix/postfix for Sync related properties.
Methods inherited from class org.torproject.metrics.collector.cron.CollecTorMain
call, checkAvailableSpace, readProcessedFiles, run, syncMapPathsDescriptors, writeProcessedFiles
-
Constructor Details
-
CreateIndexJson
Initialize this class with the given configuration.
- Parameters:
configuration - Configuration values.
-
-
Method Details
-
module
public java.lang.String module()
Description copied from class: CollecTorMain
Returns the module name for logging purposes.
- Specified by:
module in class CollecTorMain
-
syncMarker
protected java.lang.String syncMarker()
Description copied from class: CollecTorMain
Returns property prefix/infix/postfix for Sync related properties.
- Specified by:
syncMarker in class CollecTorMain
-
startProcessing
public void startProcessing()
Run the indexer by (1) adding new files from indexed/ to the index, (2) adding old files from htdocs/ for which only links exist to the index, (3) scheduling new tasks and updating links in htdocs/ to reflect what's contained in the in-memory index, and (4) writing new uncompressed and compressed index.json files to disk.
- Specified by:
startProcessing in class CollecTorMain
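Step (3) above amounts to comparing what the in-memory index contains with what is currently linked in htdocs/. The sketch below shows that comparison as two set differences. It is an illustration only and not the class's actual logic: LinkSyncSketch and its methods are hypothetical, paths are modeled as plain strings, and the limited retention period for removed files is deliberately left out.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of CollecTor.
class LinkSyncSketch {

  /** Paths present in the index but not yet linked: candidates for new links. */
  static Set<String> linksToCreate(Set<String> indexedPaths, Set<String> linkedPaths) {
    Set<String> result = new TreeSet<>(indexedPaths);
    result.removeAll(linkedPaths);
    return result;
  }

  /** Paths still linked but no longer in the index: candidates for later deletion. */
  static Set<String> linksWithoutIndexEntry(Set<String> indexedPaths, Set<String> linkedPaths) {
    Set<String> result = new TreeSet<>(linkedPaths);
    result.removeAll(indexedPaths);
    return result;
  }
}
```

Both methods copy their input before removing elements, so the caller's sets are left unchanged.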
-
startProcessing
protected void startProcessing(java.time.Instant now)
Helper method to startProcessing() that accepts the current execution time and which is used by tests.
- Parameters:
now - Current execution time.
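Passing the execution time in as a parameter is a common way to make time-dependent code deterministic in tests. The following sketch shows the general pattern with a stand-in class. TimedJobSketch and its lastRun field are hypothetical and say nothing about what CreateIndexJson does with the timestamp.

```java
import java.time.Instant;

// Hypothetical stand-in, not the CollecTor class.
class TimedJobSketch {

  /** Execution time seen by the most recent run (illustrative field). */
  Instant lastRun;

  /** Production entry point: delegates with the current wall-clock time. */
  public void startProcessing() {
    startProcessing(Instant.now());
  }

  /** Test-friendly overload: the caller decides what "now" is. */
  protected void startProcessing(Instant now) {
    this.lastRun = now;
  }
}
```

A test in the same package can then call startProcessing(Instant.parse("2020-01-01T00:00:00Z")) and assert on results that would otherwise depend on the clock.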
-
obtainBuildRevision
protected java.lang.String obtainBuildRevision()
Obtain and return the build revision string that was generated during the build process with git rev-parse --short HEAD and written to collector.buildrevision.properties, or return null if the build revision string cannot be obtained.
- Returns:
Build revision string.
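One plausible way to implement the behavior described above is to load the properties file as a classpath resource and fall back to null on any failure. This is a sketch under assumptions rather than the actual method body: loading from the classpath is assumed, the property key collector.build.revision is a hypothetical name, and BuildRevisionSketch is not part of CollecTor.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical helper, not part of CollecTor.
class BuildRevisionSketch {

  /** File name taken from the documentation above. */
  static final String RESOURCE = "collector.buildrevision.properties";

  /** Assumed property key; the real key is not stated in the documentation. */
  static final String KEY = "collector.build.revision";

  /** Reads the revision from a properties stream, or returns null if unavailable. */
  static String readRevision(InputStream in) {
    if (in == null) {
      return null;
    }
    try {
      Properties properties = new Properties();
      properties.load(in);
      String revision = properties.getProperty(KEY);
      return (revision == null || revision.isEmpty()) ? null : revision;
    } catch (IOException e) {
      return null;
    }
  }

  /** Looks the resource up on the classpath (assumed location). */
  static String obtainBuildRevision() {
    try (InputStream in =
        BuildRevisionSketch.class.getClassLoader().getResourceAsStream(RESOURCE)) {
      return readRevision(in);
    } catch (IOException e) {
      return null;
    }
  }
}
```

Callers need to handle the null case, for example by omitting the revision from whatever output would otherwise include it.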
-
createIndexerTask
protected org.torproject.metrics.collector.indexer.IndexerTask createIndexerTask(java.nio.file.Path fileToIndex)
Create an indexer task for indexing the given file.
The reason why this is a separate method is that it can be overridden by tests that don't actually want to index files but instead provide their own index results.
- Parameters:
fileToIndex - File to index.
- Returns:
Indexer task.
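The override-in-tests idea described above is the factory method pattern. The sketch below illustrates it with stand-in types. IndexerSketch and StubbedIndexerSketch are hypothetical, and a Callable<String> stands in for the real IndexerTask, whose interface is not described here.

```java
import java.nio.file.Path;
import java.util.concurrent.Callable;

// Hypothetical stand-in for the production class.
class IndexerSketch {

  /** Factory method: subclasses may return a different task. */
  protected Callable<String> createIndexerTask(Path fileToIndex) {
    // Placeholder for real, potentially slow indexing work.
    return () -> "indexed:" + fileToIndex.getFileName();
  }

  /** Runs whatever task the factory method produced. */
  String index(Path file) {
    try {
      return createIndexerTask(file).call();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }
}

// Hypothetical test double that supplies its own index result.
class StubbedIndexerSketch extends IndexerSketch {

  @Override
  protected Callable<String> createIndexerTask(Path fileToIndex) {
    return () -> "stub:" + fileToIndex.getFileName();
  }
}
```

Because index() only ever calls the factory method, a test can swap in canned results without touching any files.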
-