How to convert Row to Dataset&lt;Row&gt; in Spark Java



I'm iterating over a Dataset&lt;Row&gt; using a ForeachFunction. Inside the iteration I don't know how to append some custom columns to the Row and add it to another Dataset&lt;Row&gt; in Spark Java.



Code:



groupedDataset.foreach((ForeachFunction&lt;Row&gt;) row -> {
    double average = ...; // some value computed here

    // The Row has four columns.
    // All I want is a new Dataset&lt;Row&gt; with specific columns from the Row,
    // i.e. row(0), row(1), row(3), plus the average value.
    // Pseudocode of what I am after:
    Dataset&lt;Row&gt; newDs = row.getString("ID"), row.getString("time"), row.getInt("value"), average;
});


I have tried a lot but couldn't solve it.



Thank you!






































java apache-spark apache-spark-sql






asked Mar 7 at 14:23 by Vignesh, edited Mar 7 at 17:40 by gudok






















1 Answer






































Rows are not supposed to be modified directly (it is possible, but not convenient). When manipulating dataframes (Datasets of rows), you should use the Spark SQL API, for two main reasons: (1) it is easy to use, and (2) it lets Spark apply many optimizations to your queries.
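
A minimal sketch of that "possible but not convenient" direct route, assuming the column names and types from the question (ID and time as strings, value as an int) and a hypothetical someAverage helper: collect the rows to the driver, rebuild each one with RowFactory, and create a new Dataset&lt;Row&gt; from an explicit schema. This pulls all data to the driver and loses Spark's distributed execution, which is exactly why the Spark SQL API is preferable.

import java.util.List;
import java.util.stream.Collectors;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

class ManualRowsSketch {

    static double someAverage(Row r) {
        return 0.0; // placeholder; the question does not say how the average is computed
    }

    static Dataset<Row> withAverage(SparkSession spark, Dataset<Row> groupedDataset) {
        // Assumed output schema: the three columns of interest plus the computed average.
        StructType schema = new StructType()
                .add("ID", DataTypes.StringType)
                .add("time", DataTypes.StringType)
                .add("value", DataTypes.IntegerType)
                .add("average", DataTypes.DoubleType);

        // collectAsList() brings every row to the driver, so this only makes sense for small data.
        List<Row> rebuilt = groupedDataset.collectAsList().stream()
                .map(r -> RowFactory.create(
                        r.getAs("ID"),
                        r.getAs("time"),
                        r.getAs("value"),
                        someAverage(r))) // hypothetical helper, not from the original post
                .collect(Collectors.toList());

        return spark.createDataFrame(rebuilt, schema);
    }
}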



Now, here is an example that seems to match what you are trying to achieve. Basically, I create a dataset with three columns and use a select to keep two of them, compute their average, and discard the third. Let me know if you need more details.



import static org.apache.spark.sql.functions.col;

SparkSession spark = SparkSession.builder().getOrCreate();

Dataset<Row> data = spark
        .range(10)
        .select(col("id").as("id"),
                col("id").cast("string").as("str"),
                col("id").plus(5).as("id5"));
data.show();

Dataset<Row> result = data
        .select(col("id"), col("id5"),
                col("id").plus(col("id5")).divide(2).as("avg"));
result.show();


which yields:

+---+---+---+
| id|str|id5|
+---+---+---+
|  0|  0|  5|
|  1|  1|  6|
|  2|  2|  7|
+---+---+---+

+---+---+---+
| id|id5|avg|
+---+---+---+
|  0|  5|2.5|
|  1|  6|3.5|
|  2|  7|4.5|
+---+---+---+





answered Mar 7 at 16:50 by Oli
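
If the intent behind groupedDataset is a per-ID average, the same Spark SQL approach can be mapped onto the shape described in the question. Below is a sketch under the assumption that the columns are named ID, time and value; first() is just one possible way to carry the non-aggregated columns along, and a window function would work instead if every original row must be kept.

import static org.apache.spark.sql.functions.avg;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.first;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

class GroupByAverageSketch {

    // One row per ID, keeping a representative time and value plus the average of value.
    static Dataset<Row> averagePerId(Dataset<Row> input) {
        return input.groupBy(col("ID"))
                .agg(first(col("time")).as("time"),
                     first(col("value")).as("value"),
                     avg(col("value")).as("average"));
    }
}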





















