Class OverlapAccumulatorKernel¶
Defined in File BatchWindowFunctionProcessing.h
Inheritance Relationships¶
Base Type¶
public ral::cache::distributing_kernel
(Class distributing_kernel)
Class Documentation¶
-
class
ral::batch
::
OverlapAccumulatorKernel
: public ral::cache::distributing_kernel¶ The OverlapAccumulatorKernel assumes three input caches:
”batches”
”preceding_overlaps”
”following_overlaps” It assumes the previous kernel will fill “batches” N cacheData that is sorted and the batches are in order It assumes that preceding_overlaps and following_overlaps will contain N-1 cacheData that corresponds to the preceding and following overlaps copied from the batches The idea is that part of batches[x] will be copied to fill preceding_overlaps[x+1] and part will be copied to fill following_overlaps[x-1] The preceding_overlaps and following_overlaps will contain metadata to indicate if the overlaps are complete or incomplete (DONE_OVERLAP_STATUS or INCOMPLETE_OVERLAP_STATUS) For example, preceding_overlaps[x+1] would be set to INCOMPLETE_OVERLAP_STATUS if batches[x] was not big enough to fulfill the required preceding_value which is defined by the window frame (i.e. ROWS BETWEEN X PRECEDING AND Y FOLLOWING)
The purpose of OverlapAccumulatorKernel is to ensure that all INCOMPLETE_OVERLAP_STATUS overlaps coming from the previous kernel are COMPLETED by copying from other batches. The other purpose is to fill the overlaps of preceding_overlaps[0] and following_overlaps[N] with data that has to come from the neighboring nodes (or make a blank overlap if there is no neighbor) The OverlapAccumulatorKernel will comunicate with other nodes by sending overlap_requests (PRECEDING_REQUEST, FOLLOWING_REQUEST) which when received and responded to as PRECEDING_RESPONSE and FOLLOWING_RESPONSE
Right before outputting, OverlapAccumulatorKernel will combine preceding_overlaps[x], batches[x] and following_overlaps[x] together to make one batch pushed to the output. The following kernel, will then have in one batch with number of rows (preceding_overlaps[x]->num_rows() + batches[x]->num_rows() + following_overlaps[x]->num_rows()), which is the data necessary to procude a batches[x]->num_rows() worth out final output rows.
This kernel uses ConcatenatingCacheDatas a lot to try to reduce and postpone the materialization of data.
Public Functions
-
inline virtual std::string
kernel_name
()¶
-
virtual kstatus
run
() override¶ Executes the batch processing. Loads the data from their input port, and after processing it, the results are stored in their output port.
- Return
kstatus ‘stop’ to halt processing, or ‘proceed’ to continue processing.