• Merging field values across records after grouping


    【Question】

    For example, I have data in a CSV file with the columns "people", "committers", and "repositoryCommitters".

    The "people" column holds the ids 1 to 5923, and I want to match ids that share a repository in the "repositoryCommitters" column. For example:

    people | repositoryCommitters
    1 | x
    2 | x
    3 | y

    People ids 1 and 2 share the repo "x". How do I get these ids and print them in the output like this:

    *Edges
    1 2

    This means 1 and 2 are linked because they share a repository.

    The code I have so far is:

    package network;

    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileNotFoundException;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.PrintStream;
    import java.io.Writer;
    import java.util.ArrayList;
    import java.util.Scanner;

    public class Read {
        static String line;
        static BufferedReader br1 = null, br2 = null;
        static ArrayList<String> pList = new ArrayList<String>();
        static ArrayList<String> rList = new ArrayList<String>();
        static File fileName = new File("networkBuilder.txt");

        public static void main(String[] args) throws IOException {
            String fileContent = "*Vertices ";
            System.out.println("Enter your current directory: ");
            Scanner scanner = new Scanner(System.in);
            String directory = scanner.nextLine();
            try {
                // File(parent, child) inserts the correct path separator
                br1 = new BufferedReader(new FileReader(new File(directory, "people.csv")));
                br2 = new BufferedReader(new FileReader(new File(directory, "repo.csv")));
            } catch (FileNotFoundException e) {
                System.out.println(e.getMessage() + " \n file not found re-run and try again");
                System.exit(0);
            }
            int count = 0;
            try {
                br1.readLine(); // skip the header line
                while ((line = br1.readLine()) != null) {
                    pList.add(line); // add to array list
                    count++;
                }
            } catch (IOException error) {
                System.out.println(error.getMessage() + "Error reading file");
            }

            /* Vertices */
            System.out.println("\n"); // new line
            System.out.println(fileContent + count); // print out vertices
            // print out each item in the ArrayList
            int size = pList.size();
            for (int i = 0; i < size; i++) {
                String[] data = pList.get(i).split(",");
                System.out.println(data[1]);
            }

            // Save the console output in a text file
            try {
                PrintStream myconsole = new PrintStream(new File(directory, "network.txt"));
                System.setOut(myconsole);
                // print out each item in the ArrayList
                int sz = pList.size();
                System.out.println(fileContent + count); // print out vertices
                for (int i = 0; i < sz; i++) {
                    String[] data = pList.get(i).split(",");
                    System.out.println(data[1]);
                }
            } catch (Exception er) {
                er.printStackTrace(); // report the failure instead of swallowing it
            }

            /* try {
                FileWriter fw = new FileWriter(fileName);
                Writer output = new BufferedWriter(fw);
                int size = pList.size();
                for (int j = 0; j < size; j++) {
                    output.write(fileContent + count);
                    ((BufferedWriter) output).newLine();
                    output.write(pList.get(j) + "\n");
                    ((BufferedWriter) output).newLine();
                }
                output.close();
            } */

            /* Edges */
            fileContent = "\n*Edges";
            System.out.println(fileContent);
            // peopleCSV();
            // repoCSV();
        } // end of main
    }

    And the output is:

    Enter your current directory:

    C:\Users\StudentDoubts\Documents

    *Vertices 5923
    1
    2
    3
    . . .

    【Answer】

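    Group the records by the second column and, within each group, merge the values of the first column into a single row. Hand-coding this algorithm is rather cumbersome; it is more convenient to do it in esProc, where the SPL code is short and easy to follow: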

         A
    1    =file("people.txt").import@t(;,"|")
    2    =A1.group(repositoryCommitters).new(~.(people).concat(" "):*Edges)
    3    =file("D:/result.txt").export@t(A2)
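For comparison, the same group-then-join step can be sketched in plain Java. This is only a minimal sketch, not the asker's or the answerer's code: the class `EdgeBuilder`, its two methods, and the inline sample rows are hypothetical names chosen for illustration, and the input is assumed to be `|`-separated rows with the header line already removed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EdgeBuilder {
    // Groups people ids by repository, keeping repositories in first-seen order.
    // Each row is assumed to look like "people | repositoryCommitters".
    static Map<String, List<String>> groupPeopleByRepo(List<String> rows) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String row : rows) {
            String[] fields = row.split("\\|");
            if (fields.length < 2) {
                continue; // skip malformed rows
            }
            String person = fields[0].trim();
            String repo = fields[1].trim();
            groups.computeIfAbsent(repo, k -> new ArrayList<>()).add(person);
        }
        return groups;
    }

    // Builds the output lines: an "*Edges" header, then one space-joined line per repository.
    static List<String> edgeLines(Map<String, List<String>> groups) {
        List<String> lines = new ArrayList<>();
        lines.add("*Edges");
        for (List<String> people : groups.values()) {
            lines.add(String.join(" ", people));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample rows taken from the question.
        List<String> rows = List.of("1 | x", "2 | x", "3 | y");
        edgeLines(groupPeopleByRepo(rows)).forEach(System.out::println);
    }
}
```

Like the SPL script, this sketch emits one line per repository, so a repository with a single committer still produces a line containing just that id.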

    To append repositoryCommitters to each output line, just change A2 to:

    =A1.group(repositoryCommitters).new(~.(people).string(" "):*Edges,repositoryCommitters:repositoryCommitters)
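A rough Java counterpart of this two-column variant, using the streams API, might look as follows. It is a sketch under assumptions: the class `EdgeBuilderWithRepo` and the method `edgeLinesWithRepo` are hypothetical, each record is assumed to be already split into `{people, repositoryCommitters}`, and the tab between the two output columns is an assumed separator.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class EdgeBuilderWithRepo {
    // Returns one line per repository: the space-joined people ids,
    // a tab (assumed separator), then the repository name.
    static List<String> edgeLinesWithRepo(List<String[]> records) {
        // Group by repository (index 1) and join the people ids (index 0) with spaces.
        // LinkedHashMap keeps repositories in first-seen order.
        Map<String, String> joined = records.stream().collect(
                Collectors.groupingBy(
                        r -> r[1],
                        LinkedHashMap::new,
                        Collectors.mapping(r -> r[0], Collectors.joining(" "))));
        return joined.entrySet().stream()
                .map(e -> e.getValue() + "\t" + e.getKey())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample records taken from the question.
        List<String[]> records = List.of(
                new String[] {"1", "x"},
                new String[] {"2", "x"},
                new String[] {"3", "y"});
        edgeLinesWithRepo(records).forEach(System.out::println);
    }
}
```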

    esProc provides a JDBC interface and can be used like a database; see "How to call an SPL script in Java".

     

  • Original article: https://blog.csdn.net/raqsoft/article/details/128012356