[Fix](Query Stats) Add QueryStatsRecorder for column-level query and filter - Part2 #63768

Open: nsivarajan wants to merge 2 commits into apache:master from nsivarajan:fix-query-filter-stats-part-2

@nsivarajan
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #63067

Problem Summary:

This PR is a follow-up to #63067. It extends column-level query/filter hit recording to cover all major Nereids physical plan constructs beyond the base PhysicalOlapScan:

  • Alias resolution: SELECT k1 AS name records k1.queryHit
  • GROUP BY keys: GROUP BY k1 records k1.queryHit
  • Aggregate input columns: SUM(k2) records k2.queryHit
  • ORDER BY columns: ORDER BY k2 records k2.queryHit
  • Window PARTITION BY / ORDER BY keys
  • Window value columns: SUM(k2) OVER (...) records k2.queryHit
  • JOIN ON conditions (hash + non-equi): records filterHit on both sides
  • ROLLUP/CUBE grouping sets via PhysicalRepeat
  • PartitionTopN partition and order keys (ROW_NUMBER per-partition)
  • Storage-layer aggregate pushdown: COUNT(*)/MIN/MAX queries record stats
  • Lazy materialization scan slot remapping via row-id lookup
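
The recording rules listed above can be illustrated with a small, self-contained sketch. It is not the QueryStatsRecorder code from this PR: the class `ColumnHitSketch` and its methods are hypothetical names, columns are plain strings rather than Nereids slots, and every reference is counted (whether the real recorder deduplicates within one query is not stated here).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the Doris QueryStatsRecorder API.
final class ColumnHitSketch {
    // Per-column counters: index 0 = queryHit, index 1 = filterHit.
    private final Map<String, long[]> hits = new HashMap<>();
    // Alias name -> underlying column, e.g. "name" -> "k1".
    private final Map<String, String> aliases = new HashMap<>();

    void alias(String aliasName, String column) {
        aliases.put(aliasName, column);
    }

    // Follow alias links until a base column is reached.
    // Assumes alias chains are acyclic.
    private String resolve(String name) {
        String current = name;
        while (aliases.containsKey(current)) {
            current = aliases.get(current);
        }
        return current;
    }

    // Columns read by SELECT, GROUP BY, ORDER BY, aggregate or window inputs.
    void recordQuery(List<String> names) {
        for (String name : names) {
            hits.computeIfAbsent(resolve(name), k -> new long[2])[0]++;
        }
    }

    // Columns used in predicates such as JOIN ON conditions.
    void recordFilter(List<String> names) {
        for (String name : names) {
            hits.computeIfAbsent(resolve(name), k -> new long[2])[1]++;
        }
    }

    long queryHit(String column) {
        long[] counters = hits.get(column);
        return counters == null ? 0 : counters[0];
    }

    long filterHit(String column) {
        long[] counters = hits.get(column);
        return counters == null ? 0 : counters[1];
    }

    public static void main(String[] args) {
        ColumnHitSketch stats = new ColumnHitSketch();
        stats.alias("name", "k1");
        stats.recordQuery(List.of("name"));            // SELECT k1 AS name
        stats.recordQuery(List.of("k2"));              // SUM(k2)
        stats.recordFilter(List.of("t1.k1", "t2.k1")); // JOIN ON t1.k1 = t2.k1
        System.out.println("k1 queryHit=" + stats.queryHit("k1"));
        System.out.println("t2.k1 filterHit=" + stats.filterHit("t2.k1"));
    }
}
```

In this sketch the alias is resolved before counting, so the hit lands on `k1` rather than on `name`, and a join condition records a filter hit for the column on each side.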

Out of scope (tracked for Part 3):

The following cases are intentionally deferred and not bugs in this PR:

  • UNION / INTERSECT / EXCEPT — set operation output slots are not yet remapped to child scans
  • CTE consumer columns — consumer-side slot IDs differ from producer scan slots
  • LATERAL VIEW / EXPLODE — generator output slots are not yet remapped
  • HAVING SUM(k2) > 0 — aggregate output predicates; simple HAVING k1 > 0 already works
  • External tables (Hive / Iceberg / JDBC) — deferred, requires separate design

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@nsivarajan
Contributor Author

run buildall

@nsivarajan
Contributor Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 31495 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 162bfea804bb7ee749f75ef9886d46744c40fc0c, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17812	4025	3902	3902
q2
q3	10841	1379	797	797
q4	4681	471	339	339
q5	7607	2230	2059	2059
q6	238	172	147	147
q7	925	784	622	622
q8	9359	1642	1621	1621
q9	6396	4887	4939	4887
q10	6426	2193	1869	1869
q11	438	268	246	246
q12	702	433	295	295
q13	18180	3299	2794	2794
q14	263	254	239	239
q15
q16	811	760	701	701
q17	912	992	906	906
q18	6710	5857	5682	5682
q19	1161	1296	1158	1158
q20	535	422	286	286
q21	5888	2728	2634	2634
q22	433	367	311	311
Total cold run time: 100318 ms
Total hot run time: 31495 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4749	4731	4944	4731
q2
q3	4877	5328	4582	4582
q4	2110	2178	1395	1395
q5	4782	4767	4731	4731
q6	239	183	132	132
q7	1841	1729	1557	1557
q8	2403	1958	1892	1892
q9	7405	7342	7328	7328
q10	4752	4669	4212	4212
q11	527	381	349	349
q12	713	744	524	524
q13	3010	3379	2778	2778
q14	275	280	247	247
q15
q16	678	692	605	605
q17	1259	1240	1237	1237
q18	7194	6960	6783	6783
q19	1085	1068	1111	1068
q20	2226	2227	1956	1956
q21	5295	4606	4413	4413
q22	528	473	433	433
Total cold run time: 55948 ms
Total hot run time: 50953 ms
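
For readers checking the tables above: the numbers are consistent with the four columns being one cold run, two hot runs, and the faster of the two hot runs, with each total being a column sum. That reading is inferred from the figures, not documented in this comment. A minimal sketch of the arithmetic, using the hypothetical names `BenchTotalsSketch` and `Row`:

```java
import java.util.List;

// Hypothetical helper for re-checking the reported totals; not part of the Doris tools.
final class BenchTotalsSketch {
    // One result row: a cold run and two hot runs, all in milliseconds.
    static final class Row {
        final String query;
        final long cold;
        final long hot1;
        final long hot2;

        Row(String query, long cold, long hot1, long hot2) {
            this.query = query;
            this.cold = cold;
            this.hot1 = hot1;
            this.hot2 = hot2;
        }

        // The last column of each row: the faster of the two hot runs.
        long bestHot() {
            return Math.min(hot1, hot2);
        }
    }

    // "Total cold run time": sum of the first numeric column.
    static long totalCold(List<Row> rows) {
        return rows.stream().mapToLong(r -> r.cold).sum();
    }

    // "Total hot run time": sum of the per-query best hot run.
    static long totalHot(List<Row> rows) {
        return rows.stream().mapToLong(Row::bestHot).sum();
    }

    public static void main(String[] args) {
        // Three rows copied from Round 1 above.
        List<Row> rows = List.of(
                new Row("q1", 17812, 4025, 3902),
                new Row("q3", 10841, 1379, 797),
                new Row("q9", 6396, 4887, 4939));
        System.out.println("cold=" + totalCold(rows) + " ms, hot=" + totalHot(rows) + " ms");
    }
}
```

Feeding all 20 populated rows of a round into these two methods reproduces the reported totals; rows without values (such as q2 and q15) contribute nothing.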

@hello-stephen
Contributor

TPC-DS: Total hot run time: 170673 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 162bfea804bb7ee749f75ef9886d46744c40fc0c, data reload: false

query5	4310	648	525	525
query6	346	231	207	207
query7	4265	580	309	309
query8	316	230	214	214
query9	8778	4024	3996	3996
query10	449	339	294	294
query11	5734	2436	2253	2253
query12	179	136	124	124
query13	1282	596	418	418
query14	6023	5440	5162	5162
query14_1	4472	4478	4442	4442
query15	212	207	185	185
query16	990	463	453	453
query17	1132	727	613	613
query18	2468	517	365	365
query19	215	215	170	170
query20	139	132	134	132
query21	220	141	116	116
query22	13687	13583	13360	13360
query23	17299	16530	16132	16132
query23_1	16200	16490	16359	16359
query24	7491	1770	1309	1309
query24_1	1334	1344	1312	1312
query25	575	489	444	444
query26	1327	325	181	181
query27	2673	532	348	348
query28	4470	2011	2021	2011
query29	1008	654	516	516
query30	308	242	199	199
query31	1120	1121	954	954
query32	89	80	71	71
query33	594	355	301	301
query34	1386	1125	668	668
query35	790	799	694	694
query36	1399	1390	1218	1218
query37	155	98	87	87
query38	3210	3154	3053	3053
query39	919	920	903	903
query39_1	878	872	868	868
query40	228	144	123	123
query41	66	64	63	63
query42	117	109	107	107
query43	328	333	292	292
query44	
query45	221	214	199	199
query46	1128	1188	723	723
query47	2365	2392	2244	2244
query48	404	405	299	299
query49	623	495	390	390
query50	958	350	240	240
query51	4373	4366	4266	4266
query52	105	105	93	93
query53	249	281	204	204
query54	305	280	258	258
query55	106	88	85	85
query56	311	299	301	299
query57	1448	1391	1333	1333
query58	299	267	264	264
query59	1575	1673	1472	1472
query60	320	324	311	311
query61	162	181	158	158
query62	689	650	593	593
query63	240	200	203	200
query64	2430	809	622	622
query65	
query66	1713	473	357	357
query67	30037	29690	29522	29522
query68	
query69	455	346	304	304
query70	996	989	949	949
query71	313	275	255	255
query72	2964	2646	2384	2384
query73	869	782	423	423
query74	5111	4961	4780	4780
query75	2668	2623	2246	2246
query76	2288	1147	765	765
query77	402	441	327	327
query78	12425	12582	11893	11893
query79	1405	1034	727	727
query80	638	534	445	445
query81	449	280	239	239
query82	1378	155	123	123
query83	359	285	248	248
query84	261	145	114	114
query85	895	536	447	447
query86	409	320	320	320
query87	3493	3382	3201	3201
query88	3637	2715	2733	2715
query89	466	392	340	340
query90	1987	184	179	179
query91	181	166	141	141
query92	81	77	73	73
query93	1457	1463	910	910
query94	534	371	300	300
query95	679	468	349	349
query96	1060	808	352	352
query97	2742	2692	2606	2606
query98	245	232	230	230
query99	1189	1136	1028	1028
Total cold run time: 254210 ms
Total hot run time: 170673 ms

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 75.23% (82/109) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 86.24% (94/109) 🎉
Increment coverage report
Complete coverage report

@nsivarajan force-pushed the fix-query-filter-stats-part-2 branch from 162bfea to 6372d8a on May 28, 2026 02:13

@nsivarajan
Contributor Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 31659 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 6372d8a842a28ba57f6f4f4329fb78a2a8ee42c5, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17632	4107	4127	4107
q2
q3	10810	1391	817	817
q4	4684	472	338	338
q5	7627	2282	2131	2131
q6	237	176	136	136
q7	910	810	636	636
q8	9365	1698	1597	1597
q9	5140	4984	4936	4936
q10	6387	2216	1931	1931
q11	424	269	246	246
q12	635	416	291	291
q13	18128	3425	2771	2771
q14	268	252	237	237
q15
q16	823	782	714	714
q17	961	879	998	879
q18	6863	5932	5512	5512
q19	1319	1319	1076	1076
q20	634	460	290	290
q21	6212	2833	2699	2699
q22	558	372	315	315
Total cold run time: 99617 ms
Total hot run time: 31659 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4926	4818	4759	4759
q2
q3	4957	5291	4671	4671
q4	2140	2172	1393	1393
q5	5013	4672	4695	4672
q6	230	177	134	134
q7	1854	1701	1581	1581
q8	2447	2159	2073	2073
q9	7705	7449	7512	7449
q10	4731	4669	4240	4240
q11	533	389	359	359
q12	727	739	521	521
q13	2987	3524	2826	2826
q14	283	270	265	265
q15
q16	674	699	609	609
q17	1277	1243	1244	1243
q18	7241	6895	6706	6706
q19	1118	1122	1131	1122
q20	2231	2227	1947	1947
q21	5253	4566	4433	4433
q22	525	496	422	422
Total cold run time: 56852 ms
Total hot run time: 51425 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171940 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 6372d8a842a28ba57f6f4f4329fb78a2a8ee42c5, data reload: false

query5	4345	659	534	534
query6	352	217	207	207
query7	4235	551	315	315
query8	343	231	213	213
query9	8775	4148	4125	4125
query10	462	349	306	306
query11	5791	2431	2209	2209
query12	191	134	136	134
query13	1297	644	432	432
query14	6310	5516	5216	5216
query14_1	4570	4512	4527	4512
query15	214	205	191	191
query16	993	467	431	431
query17	1159	751	619	619
query18	2482	507	367	367
query19	224	213	165	165
query20	139	140	136	136
query21	221	143	123	123
query22	13623	13565	13362	13362
query23	17430	16672	16261	16261
query23_1	16445	16480	16413	16413
query24	7443	1784	1300	1300
query24_1	1351	1313	1347	1313
query25	554	472	435	435
query26	1303	317	177	177
query27	2695	572	346	346
query28	4388	2000	1996	1996
query29	982	632	490	490
query30	308	242	198	198
query31	1134	1100	965	965
query32	89	78	76	76
query33	539	360	294	294
query34	1190	1114	649	649
query35	775	812	703	703
query36	1391	1427	1283	1283
query37	156	103	94	94
query38	3237	3187	3046	3046
query39	928	916	920	916
query39_1	888	905	890	890
query40	230	145	123	123
query41	69	66	61	61
query42	110	108	108	108
query43	333	344	300	300
query44	
query45	215	205	201	201
query46	1117	1204	735	735
query47	2354	2367	2284	2284
query48	413	421	310	310
query49	655	501	392	392
query50	1030	344	244	244
query51	4404	4368	4224	4224
query52	103	104	96	96
query53	259	291	205	205
query54	304	268	265	265
query55	93	91	85	85
query56	306	299	317	299
query57	1438	1407	1336	1336
query58	306	273	261	261
query59	1614	1683	1477	1477
query60	317	361	310	310
query61	161	151	151	151
query62	692	668	592	592
query63	249	204	205	204
query64	2419	800	660	660
query65	
query66	1707	485	380	380
query67	29136	29764	29617	29617
query68	
query69	460	346	308	308
query70	1031	1042	952	952
query71	306	280	263	263
query72	2729	2744	2486	2486
query73	864	773	423	423
query74	5107	4921	4778	4778
query75	2712	2605	2296	2296
query76	2279	1115	770	770
query77	400	413	333	333
query78	12559	12369	11988	11988
query79	1415	1109	759	759
query80	657	547	456	456
query81	451	281	245	245
query82	1378	154	121	121
query83	355	280	255	255
query84	254	146	115	115
query85	887	538	447	447
query86	391	343	339	339
query87	3414	3391	3265	3265
query88	3623	2776	2751	2751
query89	439	385	348	348
query90	1979	183	183	183
query91	178	167	141	141
query92	77	78	74	74
query93	1504	1496	885	885
query94	527	359	311	311
query95	681	377	457	377
query96	1091	792	353	353
query97	2748	2703	2644	2644
query98	242	233	239	233
query99	1170	1149	1022	1022
Total cold run time: 253872 ms
Total hot run time: 171940 ms

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 75.63% (90/119) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 55.21% (106/192) 🎉
Increment coverage report
Complete coverage report
