常规分组

阅读(2980) 标签: 常规分组,

本章主要展示关于常规分组的代码参考,比如找出字段的唯一值、删除重复成员、去除相邻的重复成员、将分组的结果再合并成序表、每5个成员分作一组、计算分组后汇总值、取出分组结果的子集、对分组结果再过滤或排序、对分组结果的子集再过滤或排序、对分组结果的子集再做分组、针对组内成员做跨行计算、取出每个分组子集中的某个成员、找出出现次数最多的成员、分组后的TOPN、找出连续数列、相邻相同键值分组、条件不同另起分组、迭代函数分组聚合等。

找出字段的唯一值

 

A

 

1

=demo.query("select NAME,EVENT,SCORE from GYMSCORE")

 

2

=A1.id(EVENT)

 

 

删除重复成员

 

A

 

1

=demo.query("select NAME,TYPE, PRODUCTION from LIQUORS")

 

2

=A1.id@u(TYPE)

去掉重复的成员,不改变成员次序

3

=A1.group@1u(TYPE)

 

4

=[1,2,2,3,3,4,5,6,2,3].id@u()

[1,2,3,4,5,6]

 

去除相邻的重复成员

 

A

 

1

=demo.query("select NAME,TYPE, PRODUCTION from LIQUORS")

 

2

=A1.id@o(TYPE)

不排序,仅去掉相邻的重复成员

3

=A1.group@1o(TYPE)

 

4

=[1,2,5,5,3,4,5,6,2,3].id@o()

[1,2,5,3,4,5,6,2,3]

 

将分组的结果再合并成序表

 

A

 

1

=demo.query("select NAME,EVENT,SCORE from GYMSCORE")

 

2

=A1.group@s(EVENT)

 

3

=A1.group(EVENT)

 

4

=A2.conj()

 

 

每 5 个成员分作一组

 

A

 

1

=demo.query("select NAME,EVENT,SCORE from GYMSCORE")

 

2

=A1.group(int((#-1)/5))

 

 

计算分组后汇总值

 

A

 

1

=demo.query("select NAME,EVENT,SCORE from GYMSCORE")

 

2

=A1.groups(EVENT:GymEvent; sum(SCORE):TotalScore)

 

3

=A1.group(EVENT)

 

4

=A3.new(EVENT: GymEvent,~.sum(SCORE): TotalScore)

 

 

取出分组结果的子集

 

A

 

1

=demo.query("select EID,NAME,STATE, GENDER, BIRTHDAY,HIREDATE,DEPT,SALARY from EMPLOYEE")

 

2

=A1.group(DEPT)

 

3

=A2.maxp(~.avg(age(BIRTHDAY)))

平均年龄最大的部门员工记录

 

对分组结果再过滤或排序

 

A

 

1

=demo.query("select NAME,EVENT,SCORE from GYMSCORE")

 

2

=A1.group(EVENT)

 

3

=A2.select(~.avg(SCORE)>14.3)

 

4

=A3.sort(-(~.avg(SCORE)))

 

5

=A4(to(2))

平均分最大的2个项目

6

=demo.query("select EID,NAME,STATE,GENDER, BIRTHDAY,HIREDATE,DEPT,SALARY from EMPLOYEE")

 

7

=A6.groups(DEPT;count(age(BIRTHDAY)>40):Number)

 

8

=A7.select(Number>=20).(DEPT)

大于40岁的员工人数超过20个的部门

 

对分组结果的子集再过滤或排序

 

A

 

1

=demo.query("select NAME,EVENT,SCORE from GYMSCORE")

 

2

=A1.group(EVENT)

 

3

>A2.(~=~.sort(-SCORE))

 

4

>A2.(~=~(to(2)))

 

5

=A2.(~.(NAME)).isect()

所有项目都前2名的运动员

 

对分组结果的子集再做分组

 

A

 

1

=demo.query("select EID,NAME,STATE,GENDER, BIRTHDAY,HIREDATE,DEPT,SALARY from EMPLOYEE")

 

2

=A1.group(DEPT)

 

3

>A2.(~=~.group(month(BIRTHDAY),day(BIRTHDAY)))

对子集再分组

4

=A2.maxp(~.count())

 

5

=A4(1).DEPT

 

 

针对组内成员做跨行计算

 

A

B

C

 

1

=demo.query("select * from STOCKRECORDS")

 

 

 

2

=A1.group(STOCKID).(~.sort(DATE))

 

 

 

3

for A2

=0

 

 

4

 

if A3.pselect(B3= if( CLOSING/CLOSING[-1] >=1.05,B3+1,0):4)>0

 

有三天涨停

5

 

 

=C5|A3.STOCKID

记录结果

 

取出每个分组子集中的某个成员

 

A

 

1

=demo.query("select NAME,TYPE,PRODUCTION from LIQUORS")

 

2

=A1.group(TYPE).new(TYPE,~.m(-1):Last)

直接分组聚合

3

=demo.query("select EID,NAME,STATE,GENDER, BIRTHDAY,HIREDATE,DEPT,SALARY from EMPLOYEE ")

 

4

=A3.group(DEPT)

先分组

5

=A4.(~.minp(SALARY))

后聚合

 

找出出现次数最多的成员

 

A

 

1

=demo.query("select EID,NAME,STATE,GENDER, BIRTHDAY,HIREDATE,DEPT,SALARY from EMPLOYEE ")

 

2

=A1.group(DEPT)

分组

3

=A2.maxp(~.count())

找出雇员最多的组

4

=A3(1).DEPT

雇员数最多的部门

分组后的 TOPN

 

A

 

1

=demo.query("select * from EMPLOYEE ")

 

2

=A1.groups(DEPT;top(3;BIRTHDAY):TOP3_EMP)

各部门年龄最大的3个成员

 

找出连续数列

 

A

 

1

[1,3,4,5,8,9,15,16,20]

 

2

=A1.group(~-#).select(~.len()>1)

 

 

相邻相同键值分组

 

A

 

1

=demo.query("select* from SCORES")

 

2

=A1.groups@o(CLASS,STUDENTID;sum(SCORE):SCORE)

 

 

条件不同另起分组

 

A

 

1

=demo.query("select* from STOCKRECORDS")

按日期有序

2

=A1.select(STOCKID:"000062")

选择000062这支股票

3

=A2.group@i(CLOSING<CLOSING[-1])

条件为true开始新分组

4

=A3.max(~.len())-1

连续增长多少天

 

迭代函数分组聚合

 

A

 

1

=file("E:/txt/orders_i.csv").import@t()

 

2

=A1.group(sellerid;(~.iterate((x=#,~~+amount),0,~~>500000),x):breach50)

计算各销售员几个月销售额超过50