Class OverlapAccumulatorKernel¶

Defined in File BatchWindowFunctionProcessing.h

Inheritance Relationships¶

Base Type¶

public ral::cache::distributing_kernel (Class distributing_kernel)

Class Documentation¶

class ral::batch::OverlapAccumulatorKernel : public ral::cache::distributing_kernel ¶

The OverlapAccumulatorKernel assumes three input caches:

”batches”
”preceding_overlaps”
”following_overlaps” It assumes the previous kernel will fill “batches” N cacheData that is sorted and the batches are in order It assumes that preceding_overlaps and following_overlaps will contain N-1 cacheData that corresponds to the preceding and following overlaps copied from the batches The idea is that part of batches[x] will be copied to fill preceding_overlaps[x+1] and part will be copied to fill following_overlaps[x-1] The preceding_overlaps and following_overlaps will contain metadata to indicate if the overlaps are complete or incomplete (DONE_OVERLAP_STATUS or INCOMPLETE_OVERLAP_STATUS) For example, preceding_overlaps[x+1] would be set to INCOMPLETE_OVERLAP_STATUS if batches[x] was not big enough to fulfill the required preceding_value which is defined by the window frame (i.e. ROWS BETWEEN X PRECEDING AND Y FOLLOWING)

The purpose of OverlapAccumulatorKernel is to ensure that all INCOMPLETE_OVERLAP_STATUS overlaps coming from the previous kernel are COMPLETED by copying from other batches. The other purpose is to fill the overlaps of preceding_overlaps[0] and following_overlaps[N] with data that has to come from the neighboring nodes (or make a blank overlap if there is no neighbor) The OverlapAccumulatorKernel will comunicate with other nodes by sending overlap_requests (PRECEDING_REQUEST, FOLLOWING_REQUEST) which when received and responded to as PRECEDING_RESPONSE and FOLLOWING_RESPONSE

Right before outputting, OverlapAccumulatorKernel will combine preceding_overlaps[x], batches[x] and following_overlaps[x] together to make one batch pushed to the output. The following kernel, will then have in one batch with number of rows (preceding_overlaps[x]->num_rows() + batches[x]->num_rows() + following_overlaps[x]->num_rows()), which is the data necessary to procude a batches[x]->num_rows() worth out final output rows.

This kernel uses ConcatenatingCacheDatas a lot to try to reduce and postpone the materialization of data.

Public Functions

OverlapAccumulatorKernel(std::size_t kernel_id, const std::string &queryString, std::shared_ptr<Context> context, std::shared_ptr<ral::cache::graph> query_graph)¶

inline virtual std::string kernel_name()¶

ral::execution::task_result do_process(std::vector<std::unique_ptr<ral::frame::BlazingTable>> inputs, std::shared_ptr<ral::cache::CacheMachine> output, cudaStream_t stream, const std::map<std::string, std::string> &args) override¶

virtual kstatus run() override¶

Executes the batch processing. Loads the data from their input port, and after processing it, the results are stored in their output port.

Return: kstatus ‘stop’ to halt processing, or ‘proceed’ to continue processing.

Class OutputKernel Class OverlapGeneratorKernel

BlazingSQL v0.19 documentation

Class OverlapAccumulatorKernel¶

Inheritance Relationships¶

Base Type¶

Class Documentation¶