How to convert Row to Dataset<Row> in Spark Java

I'm iterating over a Dataset<Row> using ForeachFunction. Inside the iteration, I don't know how to add some custom columns to the Row and append the result to another Dataset<Row> in Spark Java.

Code:

groupedDataset.foreach((ForeachFunction<Row>) row -> {

    double average = // some value

    // the Row has four columns
    // All I want is to have a new Dataset<Row> with specific columns
    // from the Row, i.e. row(0), row(1), row(3) and the average value

    Dataset<Row> newDs = row.getString("ID"), row.getString("time"), row.getInt("value"), average;

});

I have tried a lot, but I haven't been able to solve it.

Thank you!

edited Mar 7 at 17:40 by gudok

asked Mar 7 at 14:23 by Vignesh

java apache-spark apache-spark-sql

1 Answer

Rows are not supposed to be modified directly (it is possible, but not convenient). When manipulating dataframes (Datasets of rows), you are supposed to use the Spark SQL API, for two main reasons: (1) it is easy to use, and (2) it allows Spark to perform a lot of optimizations on your queries.
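As a rough illustration of the less convenient route, rows can be rebuilt by hand with RowFactory and wrapped in a new Dataset<Row> through createDataFrame and an explicit schema. This is only a sketch under assumptions: the class name RowsToDataset, the helper rebuild, the column positions (0, 1 and 3, taken from the comment in the question) and the column types are all hypothetical. It also expects the rows to already be on the driver (for example from collectAsList()), so it only suits small datasets.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class RowsToDataset {

    // Hypothetical schema of the new rows: ID, time, value, average.
    static final StructType SCHEMA = DataTypes.createStructType(new StructField[] {
        DataTypes.createStructField("ID", DataTypes.StringType, false),
        DataTypes.createStructField("time", DataTypes.StringType, false),
        DataTypes.createStructField("value", DataTypes.IntegerType, false),
        DataTypes.createStructField("average", DataTypes.DoubleType, false)
    });

    // Copies fields 0, 1 and 3 of every input row into a new Row, appends the
    // given average, and wraps the resulting list in a Dataset<Row>.
    static Dataset<Row> rebuild(SparkSession spark, List<Row> rows, double average) {
        List<Row> out = new ArrayList<>();
        for (Row row : rows) {
            out.add(RowFactory.create(
                row.getString(0),   // ID (assumed to be a string)
                row.getString(1),   // time (assumed to be a string)
                row.getInt(3),      // value (assumed to be an int)
                average));
        }
        return spark.createDataFrame(out, SCHEMA);
    }
}
```

A call could look like RowsToDataset.rebuild(spark, groupedDataset.collectAsList(), average). Every row passes through the driver here, which is why the column-based API is usually preferable.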

Now, here is an example that seems to match what you are trying to achieve. I create a dataset with three columns, then use a select to average two of them and discard the third. Let me know if you need more details.

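import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.col;

SparkSession spark = SparkSession.builder().getOrCreate();

// Build a sample dataset with three columns: id, str and id5.
Dataset<Row> data = spark
    .range(10)
    .select(col("id").as("id"),
            col("id").cast("string").as("str"),
            col("id").plus(5).as("id5"));
data.show();

// Keep id and id5, drop str, and add the average of id and id5.
Dataset<Row> result = data
    .select(col("id"), col("id5"),
            col("id").plus(col("id5")).divide(2).as("avg"));
result.show();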

which yields (only the first three rows of each table are shown):

+---+---+---+
| id|str|id5|
+---+---+---+
|  0|  0|  5|
|  1|  1|  6|
|  2|  2|  7|
+---+---+---+

+---+---+---+
| id|id5|avg|
+---+---+---+
|  0|  5|2.5|
|  1|  6|3.5|
|  2|  7|4.5|
+---+---+---+
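Applied to the dataset in the question, the same idea might look like the sketch below. A Dataset cannot be created from inside foreach, because that function runs on the executors, so the new Dataset<Row> is derived with a select instead. The column names ID, time and value come from the question. The class name AppendAverage, the helper withAverage and the choice of a per-ID mean of value as the "average" are assumptions standing in for the unspecified "some value".

```java
import static org.apache.spark.sql.functions.avg;
import static org.apache.spark.sql.functions.col;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.expressions.Window;

public class AppendAverage {

    // Keeps ID, time and value and appends an "average" column.
    // Assumption: the average is the mean of "value" over all rows sharing
    // the same ID, computed with a window function.
    static Dataset<Row> withAverage(Dataset<Row> grouped) {
        return grouped.select(
            col("ID"),
            col("time"),
            col("value"),
            avg(col("value")).over(Window.partitionBy(col("ID"))).as("average"));
    }
}
```

With this helper, Dataset<Row> newDs = AppendAverage.withAverage(groupedDataset); replaces the whole foreach loop, and the fourth input column is dropped simply by not selecting it.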

answered Mar 7 at 16:50 by Oli
