# distream

**Repository Path**: huoyo/distream

## Basic Information

- **Project Name**: distream
- **Description**: A List extension library for stream-style processing of list objects, including custom data handlers, lambda expressions, and expression-based calculations (`list.handle(a->...).handle(a->...).handle(a->...)`).
- **Primary Language**: Java
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: https://gitee.com/huoyo/distream.git
- **GVP Project**: No

## Statistics

- **Stars**: 2
- **Forks**: 1
- **Created**: 2022-01-28
- **Last Updated**: 2024-07-28

## Categories & Tags

**Categories**: utils

**Tags**: None

## README

Distream

---

[English documentation](README-EN.md)

A List extension library for Java that processes list data in a fluent, stream-like style using custom data handlers, lambda expressions, and expression-based calculations.

* Genuinely fluent, stream-style data processing

```java
ListFrame lines = ListFrame.fromList(list);
double sum = lines.get("value").sum();
lines = lines
        .handle("value=format(value,2)") // round to the nearest hundredth
        .handle(line->line.getName()==null,"name=''") // if (line.getName() == null) { line.setName(""); }
        .handle(line->line.getValue()==null,"value=0","value=value+2") // value = line.getValue() == null ? 0 : line.getValue() + 2;
        .handle("name=replace(name,'#','')") // replace '#' with ''
        .handle("percent=double(value)/"+sum) // convert value to double and compute its share of the total
        .groupBy("name").sum("percent"); // group by 'name'
```

* A better-performing style (supported in v1.1.0+)

```java
ListFrame lines = ListFrame.fromList(list);
double sum = lines.get("value").sum();
lines = lines
        .addHandler("value=format(value,2)")
        .addHandler("name=replace(name,'#','')")
        .addHandler("percent=double(value)/"+sum)
        .execute()
        .groupBy("name").sum("percent"); // group by 'name'
```

* The recommended style (supported in v1.1.0+, best performance)

```java
ListFrame lines = ListFrame.fromList(list);
lines = lines
        .addHandler(new DataHandler1())
        .addHandler(new DataHandler2())
        .addHandler(new DataHandler3())
        .addHandler(new DataHandler4())
        .execute();
// DataHandler1 through DataHandler4 must implement the DataHandler interface
```

* Convenient database reads

```java
DataSource dataSource = xxx;
ListFrame list = new ListFrame();
list.initDataSource(dataSource);
ListFrame<Map<String, Object>> lines = list.readSql("select * from xxx").handle(a->...).handle(a->...)...;
```

> Note: reading from a database requires the matching connection driver as a dependency, for example for MySQL:

```
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>8.0.22</version>
</dependency>
```

#### Installation

Add the Maven dependency:

```
<dependency>
    <groupId>cn.langpy</groupId>
    <artifactId>distream</artifactId>
    <version>1.1.2</version>
</dependency>
```

#### Reading and transforming data

##### 0. Assume the following file

The columns are 序号 (index), 姓名 (name), 年龄 (age), and 收入 (income). The column names are kept in Chinese because the expressions below refer to them.

```
序号,姓名,年龄,收入
1,张三,23,5000.11
2,李四,22,4000.22
3,李二狗,20,5000.33
4,韦陀掌,23,3000.44
5,拈花指,23,2000.55
6,小六子,18,5000.66
7,杨潇,23,3000.77
8,李留,19,5000.55
```

##### 1. Read a file line by line

Read the file line by line, appending ";" to the end of each line and prepending "=>" to the start of each line:

```java
ListFrame lines = ListFrame.readString("test.txt");
lines = lines
        .handle(line -> line + ";") // add ";" at the end of every line
        .handle(line -> "=>"+line ); // add "=>" at the front of every line
```

##### 2. Read a CSV file as maps

```java
/*read easily*/
//ListFrame<Map<String, Object>> lines = ListFrame.readMap(path);
/*read with an explicit delimiter*/
//ListFrame<Map<String, Object>> lines = ListFrame.readMap(path,",");
/*define data types*/
ListFrame<Map<String, Object>> lines = ListFrame.readMap(path,new Class[]{Integer.class,String.class,Integer.class,Double.class});
lines = lines
        .handle("收入=收入*0.8")
        .handle("序号='0'+序号;姓名=序号+姓名") // prefix 序号 with "0"; set 姓名 to 序号+姓名
        .handle(new MapHandler()); // add a key named "newKey" whose value is 1; MapHandler is defined below
```

To define a custom data handler, implement `DataHandler<E>`, where `E` is the type of each element in the list:

```java
public class MapHandler implements DataHandler<Map<String, Object>> {
    @Override
    public Map<String, Object> handle(Map<String, Object> line) {
        line.put("newKey",1);
        return line;
    }
}
```

This can also be shortened to:

```java
lines = lines.handle(map->{
    map.put("newKey",1);
    return map;
});
```

##### 3. Get data by column

Read a column and compute its maximum, minimum, and average:

```java
/*obtain data by column name*/
ListFrame indexs = lines.get("收入");
/*you can use ObjectName::getXX if the ListFrame's elements are Java objects*/
//ListFrame indexs = lines.get(User::getAge);
double maxIncome = indexs.max();
double minIncome = indexs.min();
double avgIncome = indexs.avg();
/*the index of max*/
int argmax = indexs.argmax();
/*the index of min*/
int argmin = indexs.argmin();
```

##### 4. Grouping and summing

```java
MapFrame agesGroup = lines.groupBy("年龄");
MapFrame count = agesGroup.count();
MapFrame incomeAvg = agesGroup.avg("收入");
MapFrame incomeSum = agesGroup.sum("收入");
MapFrame incomeConcat = agesGroup.concat("收入");
/*continuous groupBy*/
MapFrame incomeAgeConcat = lines.groupBy("收入").groupBy("年龄");
```

##### 5. Save to a file

```java
/*save to file*/
lines.toFile("save.txt");
```

##### 6. Converting between List and ListFrame

```java
List list = ...;
ListFrame lines = ListFrame.fromList(list);
list = lines.toList();
```

##### 7. Converting between maps and objects

```java
ListFrame lines = ListFrame.readMap(path);
ListFrame users = lines.toObjectList(User.class);
ListFrame maps = users.toMapList();
```

##### 8. Replacing data

```java
/*replace "xxx" with "yyy"*/
lines = lines.replace("column to replace","xxx","yyy");
```

##### 9. Type conversion

```java
List list = Arrays.asList("1","2","3","4");
ListFrame listFrame = ListFrame.fromList(list);
ListFrame listInt = listFrame.asInteger();
ListFrame listDouble = listFrame.asDouble();
ListFrame listFloat = listFrame.asFloat();
ListFrame listString = listFloat.asString();
```

##### 10. Counting element occurrences

```java
List list = Arrays.asList(2,2,2,4);
MapFrame listFrame = ListFrame.fromList(list).frequency();
/* yields the map {2=3,4=1} */
```

##### 11. Variance and standard deviation

```java
List list = Arrays.asList(2,2,2,4);
ListFrame listFrame = ListFrame.fromList(list);
listFrame.variance(); // variance
listFrame.standardDeviation(); // standard deviation
```

##### 12. Dropping null values

If a list contains null elements that would otherwise have to be removed by iterating over it, use the following function directly:

```java
List list = Arrays.asList(2,null,2,null,6);
ListFrame listFrame = ListFrame.fromList(list);
listFrame = listFrame.dropNull(); //[2,null,2,null,6]->[2,2,6]
```

##### 13. Removing duplicates

```java
List list = Arrays.asList(2,2,2,6,6);
ListFrame listFrame = ListFrame.fromList(list);
listFrame = listFrame.distinct(); //[2,2,2,6,6]->[2,6]
```

##### 14. Common functions

```java
ListFrame<Map<String, Object>> lines = xxx;
/*convert code to int*/
lines = lines.handle("id=int(code)");
/*convert value to double*/
lines = lines.handle("percent=double(value)");
/*convert value to string*/
lines = lines.handle("name=string(value)");
/*substring works like Java's substring*/
lines = lines.handle("name=substring(name,1,2)");
/*replace "xxx" with "yyy"*/
lines = lines.handle("name=replace(name,'xxx','yyy')");
/*you can also use '-' if you only want to remove 'xxx'*/
lines = lines.handle("name=name-'xxx'");
/*index works like Java's indexOf*/
lines = lines.handle("id=index(name,'xxx')");
/*round to the nearest hundredth*/
lines = lines.handle("percent=format(percent,2)");
```

#### Copyright notice

> 1. Copyright in this project belongs to the author, and the project is open-sourced under Apache-2.0;
>
> 2. You may use this project for study, commercial purposes, or open-source work, but any software or project that uses this project's code is asked to respect the author's original rights;
>
> 3. If you use and modify this project's source code, please state what you modified and cite the source;
>
> 4. For everything else, refer to Apache-2.0.
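
#### Plain-JDK sketch of the statistics helpers

Sections 10 to 13 above show `frequency()`, `variance()`, `standardDeviation()`, `dropNull()`, and `distinct()` without spelling out what they compute. The sketch below is one way to reproduce the documented results with only the JDK, which can be handy for checking expectations. `ListStatsSketch` and its methods are hypothetical names for illustration and are not part of distream. It also assumes population variance (dividing by n); the README does not say whether distream divides by n or n-1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical helper class, NOT part of distream: plain-JDK equivalents
// of the operations shown in sections 10 to 13.
final class ListStatsSketch {

    // Count how often each element occurs, keeping first-seen order.
    static <T> Map<T, Integer> frequency(List<T> values) {
        Map<T, Integer> counts = new LinkedHashMap<>();
        for (T value : values) {
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    // ASSUMPTION: population variance (divide by n), not sample variance.
    static double variance(List<? extends Number> values) {
        double mean = values.stream()
                .mapToDouble(Number::doubleValue)
                .average()
                .orElse(Double.NaN);
        return values.stream()
                .mapToDouble(v -> Math.pow(v.doubleValue() - mean, 2))
                .average()
                .orElse(Double.NaN);
    }

    // Square root of the variance defined above.
    static double standardDeviation(List<? extends Number> values) {
        return Math.sqrt(variance(values));
    }

    // Remove null elements, keeping the order of the remaining ones.
    static <T> List<T> dropNull(List<T> values) {
        return values.stream().filter(Objects::nonNull).collect(Collectors.toList());
    }

    // Remove duplicates, keeping the first occurrence of each element.
    static <T> List<T> distinct(List<T> values) {
        return new ArrayList<>(new LinkedHashSet<>(values));
    }

    public static void main(String[] args) {
        List<Integer> sample = Arrays.asList(2, 2, 2, 4);
        System.out.println(frequency(sample));                            // {2=3, 4=1}
        System.out.println(variance(sample));                             // 0.75
        System.out.println(standardDeviation(sample));
        System.out.println(dropNull(Arrays.asList(2, null, 2, null, 6))); // [2, 2, 6]
        System.out.println(distinct(Arrays.asList(2, 2, 2, 6, 6)));       // [2, 6]
    }
}
```

If distream's `variance()` returns 1.0 rather than 0.75 for `[2,2,2,4]`, it is using the sample formula (dividing by n-1), and the sketch would need the same change to match.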