Pre-aggregated (Rollup) Ingestion
This example shows how to ingest data with pre-aggregation (rollup) enabled.
- The JSON used in this example is as follows:

```json
{
  "type": "lucene_supervisor",
  "dataSchema": {
    "dataSource": "rollup-test",
    "parser": {
      "type": "string",
      "parseSpec": {
        "format": "json",
        "dimensionsSpec": {
          "dynamicDimension": false,
          "dimensions": [
            {"name": "s|province", "type": "string"},
            {"name": "s|event", "type": "string"}
          ]
        },
        "timestampSpec": {
          "column": "d|sugo_time",
          "excludeTimeColumn": false,
          "format": "millis"
        }
      }
    },
    "metricsSpec": [{
      "type": "thetaSketch",
      "name": "uid_estimated_count",
      "fieldName": "s|uid"
    }],
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "DAY",
      "queryGranularity": {
        "type": "period",
        "period": "P1D"
      },
      "rollup": true,
      "intervals": null
    }
  },
  "tuningConfig": {
    "type": "kafka",
    "maxRowsInMemory": 10000000,
    "maxRowsPerSegment": 20000000,
    "intermediatePersistPeriod": "PT10M",
    "buildV9Directly": true,
    "reportParseExceptions": true
  },
  "ioConfig": {
    "topic": "rollup_test",
    "replicas": 1,
    "taskCount": 1,
    "taskDuration": "PT300S",
    "consumerProperties": {
      "bootstrap.servers": "192.168.0.220:9092,192.168.0.221:9092,192.168.0.222:9092"
    },
    "startDelay": "PT5S",
    "period": "PT30S",
    "useEarliestOffset": true,
    "completionTimeout": "PT1800S",
    "lateMessageRejectionPeriod": null
  },
  "writerConfig": {
    "type": "lucene",
    "maxBufferedDocs": -1,
    "ramBufferSizeMB": 16.0,
    "indexRefreshIntervalSeconds": 6
  }
}
```

| Property | Value | Type | Required | Default | Description |
| ---- | ---- | --- | --- | --- | --- |
| type | lucene_supervisor | string | Yes | - | Specifies the ingestion type. Note: `lucene_index` also supports pre-aggregated ingestion |
| dataSchema | See DataSchema | json | Yes | - | Defines the table schema and data granularity |
| ioConfig | See kafkaSupervisorIOConfig | json | Yes | - | Defines the data source |
| tuningConfig | See kafkaSupervisorTuningConfig | json | Yes | - | Configures the task tuning parameters |
| writerConfig | See WriterConfig | json | Yes | - | Configures how data segments are written |
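To make the effect of this spec more concrete, here is a rough Python sketch of what rollup conceptually does with incoming rows. It is only an illustration, not how the ingestion task is implemented: the helper `rollup` and the sample events are hypothetical, day buckets are assumed to be cut in UTC, and an exact set of uids stands in for the approximate `thetaSketch` aggregator.

```python
from collections import defaultdict

DAY_MS = 24 * 60 * 60 * 1000  # queryGranularity P1D, expressed in milliseconds


def rollup(events):
    """Group events by (day bucket, province, event) and count distinct uids.

    Hypothetical helper: an exact set replaces the approximate thetaSketch.
    """
    buckets = defaultdict(set)
    for e in events:
        # Truncate the millisecond timestamp to the start of its UTC day.
        day = e["d|sugo_time"] // DAY_MS * DAY_MS
        key = (day, e["s|province"], e["s|event"])
        buckets[key].add(e["s|uid"])
    return {key: len(uids) for key, uids in buckets.items()}


# Made-up sample rows; all four fall within the same UTC day.
events = [
    {"d|sugo_time": 1500000000000, "s|province": "Guangdong", "s|event": "click", "s|uid": "u1"},
    {"d|sugo_time": 1500000001000, "s|province": "Guangdong", "s|event": "click", "s|uid": "u2"},
    {"d|sugo_time": 1500000002000, "s|province": "Guangdong", "s|event": "click", "s|uid": "u1"},
    {"d|sugo_time": 1500000003000, "s|province": "Hunan", "s|event": "view", "s|uid": "u3"},
]

rolled = rollup(events)
for key, count in sorted(rolled.items()):
    print(key, count)
```

Four raw rows collapse into two stored rows, one per distinct combination of day, `s|province`, and `s|event`, each carrying a distinct-uid count. Because `s|uid` is not a dimension, it does not split rows apart; it only feeds the aggregator.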
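Notes on special parameters:

- `dataSchema.parser.parseSpec.dimensionsSpec.dynamicDimension`: pre-aggregation does not support dynamic dimensions, so this is set to `false`.
- `dataSchema.metricsSpec`: specifies the fields to pre-aggregate and the aggregator applied to each.
- `dataSchema.metricsSpec.fieldName`: specifies the input field to aggregate (here, used for distinct counting). This field must not also appear in `dimensionsSpec`, which is another reason dynamic dimensions cannot be used.
- `dataSchema.granularitySpec.rollup`: set to `true` to enable pre-aggregation.
- `dataSchema.granularitySpec.queryGranularity.period`: specifies the pre-aggregation granularity. It is required and must not be left empty.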
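The constraints above are easy to get wrong when editing a spec by hand, so a small pre-submission check may help. The following is a minimal sketch, assuming the spec has already been loaded into a Python dict; `check_rollup_spec` is a hypothetical helper and not part of any ingestion tooling, and the `spec` below is a trimmed copy of the example that keeps only the fields being checked.

```python
def check_rollup_spec(spec):
    """Return a list of problems that conflict with the rollup rules above."""
    problems = []
    schema = spec["dataSchema"]
    dims_spec = schema["parser"]["parseSpec"]["dimensionsSpec"]
    gran = schema["granularitySpec"]

    # Rollup does not support dynamic dimensions.
    if dims_spec.get("dynamicDimension") is not False:
        problems.append("dimensionsSpec.dynamicDimension must be false")

    # Rollup has to be switched on explicitly.
    if gran.get("rollup") is not True:
        problems.append("granularitySpec.rollup must be true")

    # The rollup granularity (queryGranularity.period) must be set.
    query_gran = gran.get("queryGranularity") or {}
    if not (isinstance(query_gran, dict) and query_gran.get("period")):
        problems.append("granularitySpec.queryGranularity.period must be set")

    # A field that feeds an aggregator must not also be a dimension.
    dim_names = {d["name"] for d in dims_spec.get("dimensions", [])}
    for metric in schema.get("metricsSpec", []):
        if metric.get("fieldName") in dim_names:
            problems.append(
                "metric field %r must not appear in dimensionsSpec" % metric["fieldName"]
            )
    return problems


# Trimmed version of the example spec (only the fields checked here).
spec = {
    "dataSchema": {
        "parser": {"parseSpec": {"dimensionsSpec": {
            "dynamicDimension": False,
            "dimensions": [
                {"name": "s|province", "type": "string"},
                {"name": "s|event", "type": "string"},
            ],
        }}},
        "metricsSpec": [
            {"type": "thetaSketch", "name": "uid_estimated_count", "fieldName": "s|uid"}
        ],
        "granularitySpec": {
            "queryGranularity": {"type": "period", "period": "P1D"},
            "rollup": True,
        },
    }
}

problems = check_rollup_spec(spec)
print(problems)
```

An empty list means none of the four rules is violated. Adding `s|uid` to `dimensions`, for instance, would make the check report that the metric field also appears in `dimensionsSpec`.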