public class SortingCollection<T>
extends java.lang.Object
implements java.lang.Iterable<T>
Iterating over the collection requires approximately numRecordsInCollection/maxRecordsInRam open file handles. If this becomes a limiting factor, a file handle cache could be added.
If the Snappy native library is available and the snappy.disable system property is not set to true, Snappy is used to compress temporary files.
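
The file handle figure above can be estimated before choosing maxRecordsInRAM. The sketch below is a hypothetical helper, not part of SortingCollection. It assumes one open handle per spilled batch and rounds up so that a partially filled final batch is counted; both are assumptions rather than documented behavior.

```java
/** Hypothetical helper for sizing maxRecordsInRAM; not part of SortingCollection. */
final class FileHandleEstimate {

    /**
     * Estimates how many temporary files (and therefore open handles) iteration may need,
     * assuming one file per batch of maxRecordsInRam records.
     */
    static long estimate(long numRecordsInCollection, int maxRecordsInRam) {
        if (numRecordsInCollection < 0 || maxRecordsInRam <= 0) {
            throw new IllegalArgumentException("record counts must be positive");
        }
        // Ceiling division: a partially filled last batch is assumed to need its own file.
        return (numRecordsInCollection + maxRecordsInRam - 1) / maxRecordsInRam;
    }

    public static void main(String[] args) {
        // 10 million records with 500,000 held in RAM at a time.
        System.out.println(estimate(10_000_000L, 500_000));
    }
}
```

If the estimate approaches the process limit on open files, raising maxRecordsInRAM lowers it proportionally, at the cost of more heap.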
| Modifier and Type | Class | Description |
|---|---|---|
| static interface | SortingCollection.Codec<T> | Client must implement this interface, which defines how records are written to and read from file. |
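
To make the codec's role concrete, here is a minimal sketch of a codec for String records. The RecordCodec interface is a stand-in declared locally so the example compiles on its own; its method names are assumptions, and the actual methods required by SortingCollection.Codec<T> are listed on that interface's own page, not in this section.

```java
import java.io.*;

/** Hypothetical stand-in for a record codec; not the SortingCollection.Codec interface itself. */
interface RecordCodec<T> {
    void setOutputStream(OutputStream os);
    void setInputStream(InputStream is);
    void encode(T value);
    T decode(); // returns null when the input is exhausted
}

/** Writes each String as a length-prefixed modified-UTF-8 record (at most 65535 encoded bytes). */
final class StringRecordCodec implements RecordCodec<String> {
    private DataOutputStream out;
    private DataInputStream in;

    @Override
    public void setOutputStream(OutputStream os) {
        out = new DataOutputStream(os);
    }

    @Override
    public void setInputStream(InputStream is) {
        in = new DataInputStream(is);
    }

    @Override
    public void encode(String value) {
        try {
            out.writeUTF(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public String decode() {
        try {
            return in.readUTF();
        } catch (EOFException e) {
            return null; // clean end of input
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The length prefix lets decode() find record boundaries without a delimiter, so records may contain any characters.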
| Modifier and Type | Method | Description |
|---|---|---|
| void | add(T rec) | |
| void | cleanup() | Delete any temporary files. |
| void | doneAdding() | Can be called once the caller has finished adding to the collection, in order to possibly free up memory. |
| boolean | isDestructiveIteration() | |
| CloseableIterator<T> | iterator() | Prepare to iterate through the records in order. |
| static <T> SortingCollection<T> | newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM) | Syntactic sugar around the constructor, to save some typing of type parameters. |
| static <T> SortingCollection<T> | newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, boolean printRecordSizeSampling) | Syntactic sugar around the constructor, to save some typing of type parameters. |
| static <T> SortingCollection<T> | newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, boolean printRecordSizeSampling, java.nio.file.Path... tmpDir) | Syntactic sugar around the constructor, to save some typing of type parameters. |
| static <T> SortingCollection<T> | newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.util.Collection<java.io.File> tmpDirs) | Deprecated since 2017-09. Use newInstanceFromPaths(Class, Codec, Comparator, int, Collection) instead. |
| static <T> SortingCollection<T> | newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.io.File... tmpDir) | Deprecated since 2017-09. Use newInstance(Class, Codec, Comparator, int, Path...) instead. |
| static <T> SortingCollection<T> | newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.nio.file.Path... tmpDir) | Syntactic sugar around the constructor, to save some typing of type parameters. |
| static <T> SortingCollection<T> | newInstanceFromPaths(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.util.Collection<java.nio.file.Path> tmpDirs) | Syntactic sugar around the constructor, to save some typing of type parameters. |
| void | setDestructiveIteration(boolean destructiveIteration) | Tell this collection that it is allowed to discard data during iteration in order to reduce memory footprint, precluding a second iteration. |
| void | spillToDisk() | Sort the records in memory, write them to a file, and clear the buffer of records in memory. |
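
The add, spill, iterate, and cleanup cycle summarized in the table can be illustrated with a small self-contained sketch. MiniSpillSorter below is a hypothetical, String-only illustration of the general spill-and-merge technique; it is not the SortingCollection implementation, it uses natural ordering instead of a Comparator, and it assumes records contain no line breaks.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative spill-and-merge sorter for Strings; NOT the SortingCollection implementation. */
final class MiniSpillSorter {
    /** One sorted temporary file plus the next unread record from it. */
    private static final class Run {
        final BufferedReader reader;
        String head;
        Run(BufferedReader reader) { this.reader = reader; }
    }

    private final int maxRecordsInRam;
    private final List<String> ramRecords = new ArrayList<>();
    private final List<Path> spillFiles = new ArrayList<>();

    MiniSpillSorter(int maxRecordsInRam) { this.maxRecordsInRam = maxRecordsInRam; }

    void add(String rec) throws IOException {
        if (ramRecords.size() == maxRecordsInRam) {
            spillToDisk(); // buffer is full, so make room first
        }
        ramRecords.add(rec);
    }

    /** Sort the buffered records, write them to a temporary file, and clear the buffer. */
    void spillToDisk() throws IOException {
        if (ramRecords.isEmpty()) return;
        Collections.sort(ramRecords);
        Path file = Files.createTempFile("miniSpill", ".txt");
        Files.write(file, ramRecords); // one record per line
        spillFiles.add(file);
        ramRecords.clear();
    }

    /** Merges all sorted runs. A real collection would stream results instead of building a list. */
    List<String> sortedRecords() throws IOException {
        spillToDisk(); // flush whatever is still in memory as a final run
        List<Run> runs = new ArrayList<>();
        PriorityQueue<Run> heap = new PriorityQueue<>(Comparator.comparing((Run r) -> r.head));
        try {
            for (Path file : spillFiles) { // one open reader per spill file
                Run run = new Run(Files.newBufferedReader(file));
                runs.add(run);
                run.head = run.reader.readLine();
                if (run.head != null) heap.add(run);
            }
            List<String> out = new ArrayList<>();
            while (!heap.isEmpty()) {
                Run smallest = heap.poll(); // run whose next record sorts first
                out.add(smallest.head);
                smallest.head = smallest.reader.readLine();
                if (smallest.head != null) heap.add(smallest);
            }
            return out;
        } finally {
            for (Run run : runs) run.reader.close();
        }
    }

    /** Delete any temporary files. */
    void cleanup() throws IOException {
        for (Path file : spillFiles) Files.deleteIfExists(file);
        spillFiles.clear();
    }
}
```

The merge keeps one reader open per spill file, which is why the number of file handles grows with the number of spilled batches.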
public void add(T rec)
public void doneAdding()
public boolean isDestructiveIteration()
public void setDestructiveIteration(boolean destructiveIteration)
public void spillToDisk()
public CloseableIterator<T> iterator()
Specified by: iterator in interface java.lang.Iterable<T>
public void cleanup()
@Deprecated public static <T> SortingCollection<T> newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.io.File... tmpDir)
Deprecated. Use newInstance(Class, Codec, Comparator, int, Path...) instead.
Parameters:
componentType - Class of the record to be sorted. Necessary because Java generics are erased at runtime.
codec - For writing records to file and reading them back into RAM.
comparator - Defines output sort order.
maxRecordsInRAM - How many records to accumulate in memory before spilling to disk.
tmpDir - Where to write files of records that will not fit in RAM.

@Deprecated public static <T> SortingCollection<T> newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.util.Collection<java.io.File> tmpDirs)
Deprecated. Use newInstanceFromPaths(Class, Codec, Comparator, int, Collection) instead.
Parameters:
componentType - Class of the record to be sorted. Necessary because Java generics are erased at runtime.
codec - For writing records to file and reading them back into RAM.
comparator - Defines output sort order.
maxRecordsInRAM - How many records to accumulate in memory before spilling to disk.
tmpDirs - Where to write files of records that will not fit in RAM.

public static <T> SortingCollection<T> newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, boolean printRecordSizeSampling)
Parameters:
componentType - Class of the record to be sorted. Necessary because Java generics are erased at runtime.
codec - For writing records to file and reading them back into RAM.
comparator - Defines output sort order.
maxRecordsInRAM - How many records to accumulate in memory before spilling to disk.
printRecordSizeSampling - If true, record size is sampled and output at DEBUG log level.

public static <T> SortingCollection<T> newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, boolean printRecordSizeSampling, java.nio.file.Path... tmpDir)
Parameters:
componentType - Class of the record to be sorted. Necessary because Java generics are erased at runtime.
codec - For writing records to file and reading them back into RAM.
comparator - Defines output sort order.
maxRecordsInRAM - How many records to accumulate in memory before spilling to disk.
printRecordSizeSampling - If true, record size is sampled and output at DEBUG log level.
tmpDir - Where to write files of records that will not fit in RAM.

public static <T> SortingCollection<T> newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM)
Parameters:
componentType - Class of the record to be sorted. Necessary because Java generics are erased at runtime.
codec - For writing records to file and reading them back into RAM.
comparator - Defines output sort order.
maxRecordsInRAM - How many records to accumulate in memory before spilling to disk.

public static <T> SortingCollection<T> newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.nio.file.Path... tmpDir)
Parameters:
componentType - Class of the record to be sorted. Necessary because Java generics are erased at runtime.
codec - For writing records to file and reading them back into RAM.
comparator - Defines output sort order.
maxRecordsInRAM - How many records to accumulate in memory before spilling to disk.
tmpDir - Where to write files of records that will not fit in RAM.

public static <T> SortingCollection<T> newInstanceFromPaths(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.util.Collection<java.nio.file.Path> tmpDirs)
Parameters:
componentType - Class of the record to be sorted. Necessary because Java generics are erased at runtime.
codec - For writing records to file and reading them back into RAM.
comparator - Defines output sort order.
maxRecordsInRAM - How many records to accumulate in memory before spilling to disk.
tmpDirs - Where to write files of records that will not fit in RAM.
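
Callers migrating from the deprecated java.io.File overloads need their temporary directories as java.nio.file.Path values. The helper below is a hypothetical convenience class (TmpDirMigration is not part of the library) that performs the conversion using only standard JDK calls.

```java
import java.io.File;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical migration helper; not part of SortingCollection. */
final class TmpDirMigration {

    /** Converts a collection of File directories for use with a Collection<Path> parameter. */
    static List<Path> toPaths(Collection<File> tmpDirs) {
        return tmpDirs.stream().map(File::toPath).collect(Collectors.toList());
    }

    /** Converts File varargs for use with a Path... parameter. */
    static Path[] toPathArray(File... tmpDir) {
        return Arrays.stream(tmpDir).map(File::toPath).toArray(Path[]::new);
    }
}
```

The list returned by toPaths can be passed as the tmpDirs argument of newInstanceFromPaths, and the array returned by toPathArray as the tmpDir argument of the Path-based newInstance overloads.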