reading data from large file into struct in C
I am a beginner in C programming. I need to efficiently read millions of lines from a file into a struct. Below is an example of the input file.

    2,33.1609992980957,26.59000015258789,8.003999710083008
    5,15.85200023651123,13.036999702453613,31.801000595092773
    8,10.907999992370605,32.000999450683594,1.8459999561309814
    11,28.3700008392334,31.650999069213867,13.107999801635742

My current code is shown below. It gives the error "Error in file", suggesting the file is NULL, but the file has data.

    #include<stdio.h>
    #include<stdlib.h>

    struct O_DATA
    {
        int index;
        float x;
        float y;
        float z;
    };

    int main ()
    {
        FILE *infile ;
        struct O_DATA input;
        infile = fopen("input.dat", "r");
        if (infile == NULL);
        {
            fprintf(stderr,"\nError file\n");
            exit(1);
        }
        while(fread(&input, sizeof(struct O_DATA), 1, infile))
            printf("Index = %d X= %f Y=%f Z=%f", input.index , input.x , input.y , input.z);
        fclose(infile);
        return 0;
    }

I need to efficiently read and store data from an input file to process it further. Any help would be really appreciated. Thanks in advance.

c struct
asked Mar 7 at 22:00 by dipak sanap
2
Try printing the error (use the perror function to do it most simply). Most likely reason: the current working directory is not what you think it is, and that's why the file isn't found. Try using an absolute path to input.dat to see if it helps.
– hyde
Mar 7 at 22:02
6
The example file is text! fread is NOT going to work.
– xing
Mar 7 at 22:06
2
Indeed. You will need to read it as text (fscanf or fgets + some parsing).
– Eugene Sh.
Mar 7 at 22:08
1
As @xing pointed out, reading text files and parsing them into values is a several-step process.
– Ahmed Masud
Mar 7 at 22:09
1
Why are you passing sizeof(struct O_DATA) to fread? The number of bytes you want to read from the file has nothing to do with how many bytes your platform uses to store struct O_DATA!
– David Schwartz
Mar 7 at 22:09
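To make the point of these comments concrete, here is a small sketch (not from the thread) of what fread does with the first line of the sample file. The helper name fread_index_from_text is made up for illustration, and the sketch assumes tmpfile() can create a scratch file on your system.

```c
#include <stdio.h>

struct O_DATA {
    int index;
    float x;
    float y;
    float z;
};

/* Hypothetical helper: writes one text line to a scratch file, then reads
 * it back with fread the same way the question's code does.
 * Returns the value that ends up in the index field, or -1 if the scratch
 * file could not be created or was too short. */
static int fread_index_from_text(const char *text_line)
{
    struct O_DATA rec;
    FILE *fp = tmpfile();

    if (fp == NULL)
        return -1;

    fputs(text_line, fp);
    rewind(fp);

    /* fread copies raw bytes. The characters '2', ',', '3', '3' are stored
     * in rec.index as they are; nothing converts them to the number 2. */
    if (fread(&rec, sizeof rec, 1, fp) != 1) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return rec.index;
}
```

Calling fread_index_from_text with the first sample line does not give 2. The exact value depends on the platform's int size and byte order, which is why the file has to be parsed as text.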
5 Answers
First figure out how to convert one line of text to data:

    #define _POSIX_C_SOURCE 200809L /* for getline() and strdup() */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct my_data
    {
        unsigned int index;
        float x;
        float y;
        float z;
    };

    struct my_data *
    deserialize_data(struct my_data *data, const char *input, const char *separators)
    {
        (void)separators; /* unused here: the comma is part of the format string */
        /* sscanf returns the number of fields it converted; all 4 are required */
        if (sscanf(input, "%u,%f,%f,%f", &data->index, &data->x, &data->y, &data->z) != 4)
            return NULL;
        return data;
    }

The same function can also be written with strtok instead of sscanf (use one version or the other):

    struct my_data *
    deserialize_data(struct my_data *data, const char *input, const char *separators)
    {
        char *p;
        struct my_data tmp;
        char *str = strdup(input); /* make a copy of the input line because strtok modifies it */
        if (!str) /* couldn't make a copy, so give up */
            return NULL;
        p = strtok(str, separators); /* pass the line on the first call to strtok */
        if (!p) goto err;
        tmp.index = strtoul(p, NULL, 0); /* convert text to integer */
        p = strtok(NULL, separators); /* strtok remembers the line */
        if (!p) goto err;
        tmp.x = atof(p);
        p = strtok(NULL, separators);
        if (!p) goto err;
        tmp.y = atof(p);
        p = strtok(NULL, separators);
        if (!p) goto err;
        tmp.z = atof(p);
        memcpy(data, &tmp, sizeof(tmp)); /* copy values out */
        goto out;
    err:
        data = NULL;
    out:
        free(str);
        return data;
    }

    int main()
    {
        struct my_data somedata;
        if (deserialize_data(&somedata, "1,2.5,3.12,7.955", ",") == NULL)
            return EXIT_FAILURE;
        printf("index: %u, x: %2f, y: %2f, z: %2f\n", somedata.index, somedata.x, somedata.y, somedata.z);
        return 0;
    }

Combine it with reading lines from a file. Just the main function here (insert the rest from the previous example):

    int
    main(int argc, char *argv[])
    {
        FILE *stream;
        char *line = NULL;
        size_t len = 0;
        ssize_t nread;
        struct my_data somedata;

        if (argc != 2) {
            fprintf(stderr, "Usage: %s <file>\n", argv[0]);
            exit(EXIT_FAILURE);
        }
        stream = fopen(argv[1], "r");
        if (stream == NULL) {
            perror("fopen");
            exit(EXIT_FAILURE);
        }
        while ((nread = getline(&line, &len, stream)) != -1) {
            if (deserialize_data(&somedata, line, ",") == NULL)
                continue; /* skip lines that do not parse */
            printf("index: %u, x: %2f, y: %2f, z: %2f\n", somedata.index, somedata.x, somedata.y, somedata.z);
        }
        free(line); /* getline allocated the buffer; release it once after the loop */
        fclose(stream);
        exit(EXIT_SUCCESS);
    }
Notice that I used getline rather than fgets etc. because that's now the preferred way to read a line from a file. Obviously you can replace the printf with actual processing of the data.
– Ahmed Masud
Mar 7 at 23:24
1
Wouldn’t just using fscanf be simpler and more straightforward?
– Gwyn Evans
Mar 7 at 23:29
@GwynEvans true that... not sure what I was thinking :P
– Ahmed Masud
Mar 7 at 23:35
The code with fscanf doesn't care about lines (since fscanf deals with \n like with space characters).
– Basile Starynkevitch
Mar 8 at 0:00
Thanks everyone for the help! I am able to print the content of the file as desired. However, I need to save the data in the same struct as O_data so that I can work with it outside int main. Can someone help with that? Ideally, I should be able to access the ith point outside like somedata[i].
– dipak sanap
3 hours ago
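The follow-up comment above asks how to keep every record so that the ith one is available as somedata[i] outside main. One possible sketch, which nobody in the thread posted: the function name load_records, the 256-byte line buffer, and the starting capacity of 1024 are all assumptions made for illustration. It uses fgets instead of getline so that it needs only standard C.

```c
#include <stdio.h>
#include <stdlib.h>

struct my_data {
    unsigned int index;
    float x;
    float y;
    float z;
};

/* Hypothetical helper: reads every "index,x,y,z" line from fp into a
 * heap-allocated array. On success it returns the array (the caller must
 * free it) and stores the number of records in *count. It returns NULL on
 * a malformed line or an allocation failure, and also for an empty file
 * (in which case *count is 0). Assumes lines shorter than 256 bytes. */
struct my_data *load_records(FILE *fp, size_t *count)
{
    struct my_data *items = NULL;
    size_t used = 0, capacity = 0;
    char line[256];

    *count = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        struct my_data rec;

        if (line[0] == '\n')
            continue; /* tolerate empty lines */
        if (sscanf(line, "%u,%f,%f,%f", &rec.index, &rec.x, &rec.y, &rec.z) != 4) {
            free(items);
            return NULL;
        }
        if (used == capacity) {
            /* grow in large steps so realloc is not called for every line */
            size_t new_capacity = capacity ? capacity * 2 : 1024;
            struct my_data *tmp = realloc(items, new_capacity * sizeof *tmp);
            if (tmp == NULL) {
                free(items);
                return NULL;
            }
            items = tmp;
            capacity = new_capacity;
        }
        items[used++] = rec;
    }
    *count = used;
    return items;
}
```

A caller would open the file, call load_records(stream, &n), pass the returned pointer and n to whatever function does the processing, and call free on the pointer when done.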
You already have solid responses in regard to syntax/structs/etc, but I will offer another method for reading the data in the file itself: I like Martin York's CSVIterator solution. This is my go-to approach for CSV processing because it requires less code to implement and has the added benefit of being easily modifiable (i.e., you can edit the CSVRow and CSVIterator defs depending on your needs).
Here's a mostly complete example using Martin's unedited code without structs or classes. In my opinion, and especially so as a beginner, it is easier to start developing your code with simpler techniques. As your code begins to take shape, it is much clearer why and where you need to implement more abstract/advanced devices.
Note this would technically need to be compiled with C++11 or greater because of my use of std::stod (and maybe some other stuff too I am forgetting), so take that into consideration:
    //your includes
    //...
    #include <cstdio>
    #include <fstream>
    #include <string>
    #include <vector>
    #include "wherever_CSVIterator_is.h"

    int main (int argc, char* argv[])
    {
        int index;
        double tmp[3]; //since we know the shape of your input data
        std::vector<double*> saved = std::vector<double*>();
        std::vector<int> indices;
        std::ifstream file(argv[1]);
        for (CSVIterator loop(file); loop != CSVIterator(); ++loop) //loop over rows
        {
            index = std::stoi((*loop)[0]); //convert the text in col 0 to an int
            indices.push_back(index); //store int index first, always col 0
            for (int k=1; k < (int)(*loop).size(); k++) //loop across columns
            {
                tmp[k-1] = std::stod((*loop)[k]); //save double values now
            }
            saved.push_back(tmp); //caution: this stores the address of the single tmp array, so every entry points at the same storage
        }
        /*now we have two vectors of the same 'size'
        (let's pretend I wrote a check here to confirm this is true),
        so we loop through them together and access with something like:*/
        for (int j=0; j < (int)indices.size(); j++)
        {
            double* saved_ptr = saved.at(j); //get pointer to first elem of each triplet
            printf("\nindex: %g
            // ... (the rest of this call is missing from the source)
Less fuss to write, but more dangerous (if saved[] goes out of scope, we are in trouble). Also some unnecessary copying is present, but we benefit from using std::vector containers in lieu of knowing exactly how much memory we need to allocate.
Don't just give an example of an input file. Specify your input file format, at least on paper or in comments, e.g. in EBNF notation (since your example is textual... it is not a binary file). Decide if the numbers have to be on different lines (or if you might accept a file with a single huge line of a million bytes; read about the Comma Separated Values format). Then, code some parser for that format. In your case, it is likely that some very simple recursive descent parsing is enough (and your particular parser won't even use recursion).

Read more about <stdio.h> and its routines. Take time to carefully read that documentation. Since your input is textual, not binary, you don't need fread. Notice that input routines can fail, and you should handle the failure case.

Of course, fopen can fail (e.g. because your working directory is not what you believe it is). You had better use perror or errno to find out more about the cause of the failure. So at least code:

    infile = fopen("input.dat", "r");
    if (infile == NULL) {
        perror("fopen input.dat");
        exit(EXIT_FAILURE);
    }

Notice that semicolons (or their absence) are very important in C (no semicolon after the condition of if). Read again the basic syntax of the C language. Read about How to debug small programs. Enable all warnings and debug info when compiling (with GCC, compile with gcc -Wall -g at least). The compiler warnings are very useful!

Remember that fscanf doesn't handle the end of line (newline) differently from a space character. So if the input has to be on separate lines, you need to read every line separately.

You'll probably read every line using fgets (or getline) and parse every line individually. You could do that parsing with the help of sscanf (perhaps %n could be useful), and you want to use the return count of sscanf. You could also perhaps use strtok and/or strtod to do such parsing.

Make sure that your parsing and your entire program are correct. With current computers (they are very fast, and most of the time your input file sits in the page cache) it is very likely that it would be fast enough. A million lines can be read pretty quickly (if on Linux, you could compare your parsing time with the time used by wc to count the lines of your file). On my computer (a powerful Linux desktop with an AMD 2970WX processor, which has lots of cores although your program uses only one, 64 GB of RAM, and an SSD) a million lines can be read by wc in less than 30 milliseconds, so I am guessing your entire program should run in less than half a second, if given a million lines of input, and if the further processing is simple (in linear time).

You are likely to fill a large array of struct O_DATA, and that array should probably be dynamically allocated, and reallocated when needed. Read more about C dynamic memory allocation. Read carefully about the C memory management routines. They could fail, and you need to handle that failure (even if it is very unlikely to happen). You certainly don't want to reallocate that array on every loop iteration. You could probably grow it in some geometric progression (e.g. if the size of that array is size, you'll call realloc or a new malloc for some int newsize = 4*size/3 + 10; only when the old size is too small). Of course, your array will generally be a bit larger than what is really needed, but memory is quite cheap and you are allowed to "lose" some of it.

But StackOverflow is not a "do my homework" site. I gave some advice above, but you should do your homework.
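Two of the hints above are small enough to sketch: using sscanf's return count together with %n, and the 4*size/3 + 10 growth rule. This is only an illustration under assumptions, not the answer author's code. The function names parse_line and next_capacity are made up, and size_t is used in place of the int from the answer's formula.

```c
#include <stddef.h>
#include <stdio.h>

struct O_DATA {
    int index;
    float x;
    float y;
    float z;
};

/* Growth rule quoted in the answer: a bit more than 4/3 of the old size. */
static size_t next_capacity(size_t size)
{
    return 4 * size / 3 + 10;
}

/* Hypothetical helper: parses one "index,x,y,z" line.
 * Returns 1 on success, 0 if a field is missing or if anything other than
 * white space follows the fourth number.
 * %n stores how many characters have been consumed so far; it is not
 * counted in sscanf's return value, so a full match still returns 4. */
static int parse_line(const char *line, struct O_DATA *out)
{
    int consumed = 0;

    if (sscanf(line, "%d,%f,%f,%f %n",
               &out->index, &out->x, &out->y, &out->z, &consumed) != 4)
        return 0;
    /* the space before %n skipped the trailing newline, so a clean line
     * leaves us exactly at the terminating '\0' */
    return line[consumed] == '\0';
}
```

In a reading loop you would call parse_line on each line returned by fgets, and call realloc with next_capacity(size) elements only when the array is full, checking the result of realloc before using it.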
You've got an incorrect ; after your if (infile == NULL) test - try removing that...
[Edit: 2nd by 9 secs! :-)]
answered Mar 7 at 22:05 by Gwyn Evans
Hi, I did remove the ";" and it runs, however the output is wrong. Probably using fread is a bad idea. Will try to use fscanf or fgets.
– dipak sanap
Mar 7 at 23:11
Yes, fread will just blindly read the size you say, while fscanf will let you take advantage of the formatted data, but read up on it carefully, as it’s easy to get the arguments wrong!
– Gwyn Evans
Mar 7 at 23:15
This is more of a comment and not really an answer to the problem.
– Ahmed Masud
Mar 7 at 23:22
Correct - my actual answer addressed the actual question that was asked (Why “Error file”) while this was a comment to suggest a direction for change.
– Gwyn Evans
Mar 7 at 23:33
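For reference, a minimal sketch of the fscanf route mentioned in these comments. The function name read_records and the fixed-size output array are assumptions made for illustration. The arguments that are easy to get wrong are the addresses (&) and the %f conversions, which match float fields (a double would need %lf).

```c
#include <stddef.h>
#include <stdio.h>

struct O_DATA {
    int index;
    float x;
    float y;
    float z;
};

/* Hypothetical helper: reads up to max records of the form "index,x,y,z"
 * from fp into out and returns how many were read. It stops at end of file
 * or at the first record that does not match the format. */
static size_t read_records(FILE *fp, struct O_DATA *out, size_t max)
{
    size_t n = 0;

    /* fscanf returns the number of fields converted, so a complete record
     * gives 4. %d skips leading white space, including the newline left
     * over from the previous record. */
    while (n < max &&
           fscanf(fp, "%d,%f,%f,%f",
                  &out[n].index, &out[n].x, &out[n].y, &out[n].z) == 4)
        n++;
    return n;
}
```

As noted elsewhere in the thread, fscanf treats a newline like any other white space, so this accepts records that are not on separate lines. If the format requires one record per line, reading each line first and parsing it afterwards is the stricter choice.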
    if (infile == NULL);
    { /* floating block */ }

The above if is a complete statement that does nothing regardless of the value of infile. The "floating" block is executed no matter what infile contains.
Remove the semicolon to 'attach' the "floating" block to the if:

    if (infile == NULL)
    { /* if block */ }

edited Mar 7 at 22:10
answered Mar 7 at 22:05 by pmg
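The effect is easy to check in isolation. The two functions below are a made-up illustration (the names are not from the thread): each reports whether its block ran for a given pointer. Some compilers can warn about the empty if body in the first one, which is the kind of warning worth enabling.

```c
#include <stddef.h>

/* Same shape as the question's code: the semicolon ends the if statement,
 * so the block that follows is an ordinary block that always runs.
 * Returns 1 if the block ran. */
static int block_runs_with_semicolon(const void *p)
{
    int ran = 0;

    if (p == NULL);   /* empty statement: this is the whole if body */
    {
        ran = 1;      /* runs whether or not p is NULL */
    }
    return ran;
}

/* Corrected shape: the block is the body of the if.
 * Returns 1 only when p is NULL. */
static int block_runs_without_semicolon(const void *p)
{
    int ran = 0;

    if (p == NULL) {
        ran = 1;
    }
    return ran;
}
```

With a valid (non-NULL) pointer the first function still returns 1, which matches the question: the error branch ran even though fopen had succeeded.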
add a comment |
add a comment |
First figure out how to convert one line of text to data
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct my_data
unsigned int index;
float x;
float y;
float z;
;
struct my_data *
deserialize_data(struct my_data *data, const char *input, const char *separators)
char *p;
struct my_data tmp;
if(sscanf(input, "%d,%f,%f,%f", &data->index, &data->x, &data->y, &data->z) != 7)
return NULL;
return data;
deserialize_data(struct my_data *data, const char *input, const char *separators)
char *p;
struct my_data tmp;
char *str = strdup(input); /* make a copy of the input line because we modify it */
if (!str) /* I couldn't make a copy so I'll die */
return NULL;
p = strtok (str, separators); /* use line for first call to strtok */
if (!p) goto err;
tmp.index = strtoul (p, NULL, 0); /* convert text to integer */
p = strtok (NULL, separators); /* strtok remembers line */
if (!p) goto err;
tmp.x = atof(p);
p = strtok (NULL, separators);
if (!p) goto err;
tmp.y = atof(p);
p = strtok (NULL, separators);
if (!p) goto err;
tmp.z = atof(p);
memcpy(data, &tmp, sizeof(tmp)); /* copy values out */
goto out;
err:
data = NULL;
out:
free (str);
return data;
int main()
struct my_data somedata;
deserialize_data(&somedata, "1,2.5,3.12,7.955", ",");
printf("index: %d, x: %2f, y: %2f, z: %2fn", somedata.index, somedata.x, somedata.y, somedata.z);
Combine it with reading lines from a file:
just the main function here (insert the rest from the previous example)
int
main(int argc, char *argv[])
FILE *stream;
char *line = NULL;
size_t len = 0;
ssize_t nread;
struct my_data somedata;
if (argc != 2)
fprintf(stderr, "Usage: %s <file>n", argv[0]);
exit(EXIT_FAILURE);
stream = fopen(argv[1], "r");
if (stream == NULL)
perror("fopen");
exit(EXIT_FAILURE);
while ((nread = getline(&line, &len, stream)) != -1)
deserialize_data(&somedata, line, ",");
printf("index: %d, x: %2f, y: %2f, z: %2fn", somedata.index, somedata.x, somedata.y, somedata.z);
free(line);
fclose(stream);
exit(EXIT_SUCCESS);
notice that I used getline rather than fgets etc. because that's now the preferred way to read a line from a file. obviously you can change theprintf
with actual processing of the data.
– Ahmed Masud
Mar 7 at 23:24
1
Wouldn’t just using fscanf be simpler and more straightforward?
– Gwyn Evans
Mar 7 at 23:29
@GwynEvans true that... not sure what I was thinking :P
– Ahmed Masud
Mar 7 at 23:35
The code withfscanf
don't care about lines (sincefscanf
deals withn
like with space characters).
– Basile Starynkevitch
Mar 8 at 0:00
Thanks everyone for the help! I am able to print the content of the file as desired. However, I need to save the data in the same struct as O_data so that I can work with it outside int main. Can someone help with that? Ideally, I should be able to access ith point outside like somedata[i].
– dipak sanap
3 hours ago
|
show 1 more comment
First figure out how to convert one line of text to data:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct my_data
    {
        unsigned int index;
        float x;
        float y;
        float z;
    };

    /* Parse "index,x,y,z"; return data on success, NULL on malformed input. */
    struct my_data *
    deserialize_data(struct my_data *data, const char *input, const char *separators)
    {
        (void)separators; /* unused here: the comma is hard-coded in the format */

        /* sscanf returns the number of converted fields; we expect all 4 */
        if (sscanf(input, "%u,%f,%f,%f", &data->index, &data->x, &data->y, &data->z) != 4)
            return NULL;
        return data;
    }

Alternatively, the same function can be written with strtok, which honours the separators argument (use one definition or the other, not both):

    struct my_data *
    deserialize_data(struct my_data *data, const char *input, const char *separators)
    {
        char *p;
        struct my_data tmp;
        char *str = strdup(input); /* make a copy of the input line because strtok modifies it */

        if (!str) /* couldn't make a copy, so give up */
            return NULL;

        p = strtok(str, separators); /* pass the string on the first call to strtok */
        if (!p) goto err;
        tmp.index = strtoul(p, NULL, 0); /* convert text to integer */

        p = strtok(NULL, separators); /* strtok remembers the string */
        if (!p) goto err;
        tmp.x = atof(p);

        p = strtok(NULL, separators);
        if (!p) goto err;
        tmp.y = atof(p);

        p = strtok(NULL, separators);
        if (!p) goto err;
        tmp.z = atof(p);

        memcpy(data, &tmp, sizeof(tmp)); /* copy values out */
        goto out;

    err:
        data = NULL;
    out:
        free(str);
        return data;
    }

    int main()
    {
        struct my_data somedata;

        if (deserialize_data(&somedata, "1,2.5,3.12,7.955", ",") == NULL)
            return EXIT_FAILURE;
        printf("index: %u, x: %2f, y: %2f, z: %2f\n", somedata.index, somedata.x, somedata.y, somedata.z);
        return EXIT_SUCCESS;
    }
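As a possible hardening of the parser above, here is a minimal sketch that uses strtoul and strtod with an end pointer, so that a non-numeric field is rejected instead of silently becoming 0 (which is what atof returns when it cannot convert anything). The names parse_float_field and parse_line_strict are hypothetical and not part of the original answer; the struct is repeated only to keep the snippet self-contained, and the comma separator is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct my_data {
    unsigned int index;
    float x, y, z;
};

/* Hypothetical helper: parse one float starting at *cursor and advance
 * *cursor past it. Returns 0 on success, -1 if nothing could be parsed
 * or the value is out of range. */
static int parse_float_field(const char **cursor, float *out)
{
    char *end;

    errno = 0;
    double v = strtod(*cursor, &end);
    if (end == *cursor || errno == ERANGE)
        return -1;
    *out = (float)v;
    *cursor = end;
    return 0;
}

/* Hypothetical strict variant of the line parser: expects exactly
 * "index,x,y,z", optionally followed by a line terminator.
 * Returns 0 on success, -1 on malformed input (*out is left untouched). */
static int parse_line_strict(const char *line, struct my_data *out)
{
    const char *p = line;
    char *end;
    struct my_data tmp;

    assert(line != NULL && out != NULL);

    errno = 0;
    unsigned long idx = strtoul(p, &end, 10);
    if (end == p || errno == ERANGE)
        return -1;
    tmp.index = (unsigned int)idx;
    p = end;

    float *fields[3] = { &tmp.x, &tmp.y, &tmp.z };
    for (int i = 0; i < 3; i++) {
        if (*p != ',')          /* every float must be preceded by a comma */
            return -1;
        p++;
        if (parse_float_field(&p, fields[i]) != 0)
            return -1;
    }

    while (*p == '\r' || *p == '\n') /* tolerate the trailing line terminator */
        p++;
    if (*p != '\0')                  /* anything else is trailing garbage */
        return -1;

    *out = tmp;
    return 0;
}
```

Unlike the strtok version, this sketch does not need to copy the input line, since it never modifies it.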
Combine it with reading lines from a file. Only the main function is shown here (insert the rest from the previous example):

    int
    main(int argc, char *argv[])
    {
        FILE *stream;
        char *line = NULL;
        size_t len = 0;
        ssize_t nread;
        struct my_data somedata;

        if (argc != 2) {
            fprintf(stderr, "Usage: %s <file>\n", argv[0]);
            exit(EXIT_FAILURE);
        }

        stream = fopen(argv[1], "r");
        if (stream == NULL) {
            perror("fopen");
            exit(EXIT_FAILURE);
        }

        /* getline (POSIX) grows the buffer as needed and returns -1 at end of file */
        while ((nread = getline(&line, &len, stream)) != -1) {
            if (deserialize_data(&somedata, line, ",") == NULL)
                continue; /* skip malformed lines */
            printf("index: %u, x: %2f, y: %2f, z: %2f\n", somedata.index, somedata.x, somedata.y, somedata.z);
        }

        free(line);
        fclose(stream);
        exit(EXIT_SUCCESS);
    }
edited Mar 7 at 23:42
answered Mar 7 at 23:07
Ahmed Masud
Notice that I used getline rather than fgets etc. because that's now the preferred way to read a line from a file. Obviously you can replace the printf with actual processing of the data.
– Ahmed Masud
Mar 7 at 23:24
Wouldn’t just using fscanf be simpler and more straightforward?
– Gwyn Evans
Mar 7 at 23:29
@GwynEvans true that... not sure what I was thinking :P
– Ahmed Masud
Mar 7 at 23:35
The code with fscanf doesn't care about lines (since fscanf treats \n like any other whitespace character).
– Basile Starynkevitch
Mar 8 at 0:00
Thanks everyone for the help! I am able to print the content of the file as desired. However, I need to save the data in the same struct as O_data so that I can work with it outside int main. Can someone help with that? Ideally, I should be able to access the ith point outside, like somedata[i].
– dipak sanap
3 hours ago
You already have solid responses in regard to syntax/structs/etc., but I will offer another method for reading the data in the file itself: I like Martin York's CSVIterator solution. This is my go-to approach for CSV processing because it requires less code to implement and has the added benefit of being easily modifiable (i.e., you can edit the CSVRow and CSVIterator definitions depending on your needs).
Here's a mostly complete example using Martin's unedited code without structs or classes. In my opinion, and especially so for a beginner, it is easier to start developing your code with simpler techniques. As your code begins to take shape, it becomes much clearer why and where you need to implement more abstract/advanced devices.
Note this needs to be compiled as C++11 or later because of my use of std::stod (and maybe some other stuff I am forgetting), so take that into consideration:

    //your includes
    //...
    #include <array>
    #include <cstdio>
    #include <fstream>
    #include <string>
    #include <vector>
    #include "wherever_CSVIterator_is.h"

    int main (int argc, char* argv[])
    {
        if (argc < 2) return 1; //expect the CSV path as the first argument

        std::vector<std::array<double, 3>> saved; //one x,y,z triplet per row, stored by value
        std::vector<int> indices;
        std::ifstream file(argv[1]);

        for (CSVIterator loop(file); loop != CSVIterator(); ++loop) //loop over rows
        {
            std::array<double, 3> tmp{}; //since we know the shape of your input data
            indices.push_back(std::stoi((*loop)[0])); //store int index first, always col 0
            for (std::size_t k = 1; k < (*loop).size() && k <= 3; k++) //loop across columns
                tmp[k-1] = std::stod((*loop)[k]); //save double values now
            saved.push_back(tmp); //copies the triplet, so rows do not alias each other
        }

        /*now we have two vectors of the same 'size'
        (let's pretend I wrote a check here to confirm this is true),
        so we loop through them together and access with something like:*/
        for (std::size_t j = 0; j < indices.size(); j++)
        {
            const std::array<double, 3>& row = saved.at(j); //the triplet for row j
            //the original print statement is cut off in the source; this completion is an assumption
            printf("\nindex: %d, x: %g, y: %g, z: %g", indices[j], row[0], row[1], row[2]);
        }
        return 0;
    }

Less fuss to write than the struct-based answers. Be careful not to store a raw pointer to tmp in the vector instead: every entry would then point at the same array, and the pointers would dangle once tmp goes out of scope. Storing each triplet by value involves some unnecessary copying, but we benefit from using std::vector containers in lieu of knowing exactly how much memory we need to allocate.
edited Mar 8 at 21:06
answered Mar 8 at 20:48
Alex
Don't just give an example of an input file; specify your input file format (at least on paper or in comments), e.g. in EBNF notation (since your example is textual... it is not a binary file). Decide if the numbers have to be on different lines (or if you might accept a file with a single huge line made of a million bytes; read about the Comma Separated Values format). Then code a parser for that format. In your case, it is likely that some very simple recursive descent parsing is enough (and your particular parser won't even use recursion).
Read more about <stdio.h> and its routines. Take time to carefully read that documentation. Since your input is textual, not binary, you don't need fread. Notice that input routines can fail, and you should handle the failure case.
Of course, fopen can fail (e.g. because your working directory is not what you believe it is). You'd better use perror or errno to find out more about the cause of the failure. So at least code:
    infile = fopen("input.dat", "r");
    if (infile == NULL) {
        perror("fopen input.dat");
        exit(EXIT_FAILURE);
    }
Notice that semicolons (or their absence) are very important in C (no semicolon after the condition of an if). Read again the basic syntax of the C language. Read about How to debug small programs. Enable all warnings and debug info when compiling (with GCC, compile with gcc -Wall -g at least). The compiler warnings are very useful!
Remember that fscanf doesn't handle the end of line (newline) differently from a space character. So if the input has to be on different lines, you need to read every line separately.
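To illustrate the point about newlines, here is a small sketch. It uses sscanf on in-memory strings rather than fscanf on a file, on the understanding that both apply the same conversion rules; the helper name count_fields and the "index,x,y,z" record layout are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns how many of the four fields were converted. */
static int count_fields(const char *text)
{
    unsigned int index;
    float x, y, z;

    assert(text != NULL);
    /* %u and %f skip any leading whitespace, including '\n', so a record
     * that is broken across two lines is still converted in full. */
    return sscanf(text, "%u,%f,%f,%f", &index, &x, &y, &z);
}
```

With this helper, both "1,2.5,3.12,7.955" and "1,2.5,\n3.12,7.955" yield 4 converted fields, so the scanf family alone cannot tell whether a record really sat on one line.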
You'll probably read every line using fgets (or getline) and parse every line individually. You could do that parsing with the help of sscanf (perhaps the %n conversion could be useful), and you want to use the return count of sscanf. You could also perhaps use strtok and/or strtod to do such parsing.
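The line-by-line approach could look roughly like the sketch below. It checks the return count of sscanf and uses %n to verify that the whole line was consumed. The struct point layout, the function names and the 256-byte line buffer are assumptions made for the example, not something taken from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct point {              /* hypothetical record layout */
    unsigned int index;
    float x, y, z;
};

/* Returns 1 if line is exactly "index,x,y,z" (optionally followed by
 * whitespace such as the trailing newline), 0 otherwise. */
static int parse_point(const char *line, struct point *out)
{
    struct point tmp;
    int consumed = -1;

    assert(line != NULL && out != NULL);
    /* The " %n" at the end skips trailing whitespace and then records how
     * many characters were read; %n does not count as a converted field. */
    int n = sscanf(line, "%u,%f,%f,%f %n",
                   &tmp.index, &tmp.x, &tmp.y, &tmp.z, &consumed);
    if (n != 4 || consumed < 0 || line[consumed] != '\0')
        return 0;
    *out = tmp;
    return 1;
}

/* Reads the stream one line at a time and returns the number of
 * well-formed records; assumes lines are shorter than the buffer. */
static size_t count_valid_points(FILE *in)
{
    char buf[256];
    struct point p;
    size_t ok = 0;

    while (fgets(buf, sizeof buf, in) != NULL)
        if (parse_point(buf, &p))
            ok++;
    return ok;
}
```

Checking line[consumed] against '\0' is what rejects lines with trailing garbage, which the return count alone would not catch.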
Make sure that your parsing and your entire program are correct. With current computers (they are very fast, and most of the time your input file sits in the page cache) it is very likely to be fast enough. A million lines can be read pretty quickly (if on Linux, you could compare your parsing time with the time used by wc to count the lines of your file). On my computer (a powerful Linux desktop with an AMD 2970WX processor, 64 GB of RAM, and an SSD; the processor has lots of cores, but your program uses only one) a million lines can be read (by wc) in less than 30 milliseconds, so I am guessing your entire program should run in less than half a second, if given a million lines of input, and if the further processing is simple (in linear time).
You are likely to fill a large array of struct O_DATA, and that array should probably be dynamically allocated, and reallocated when needed. Read more about C dynamic memory allocation. Read carefully about the C memory management routines. They can fail, and you need to handle that failure (even if it is very unlikely to happen). You certainly don't want to reallocate that array at every loop iteration. You could probably grow it in some geometrical progression (e.g. if the size of that array is size, you'll call realloc or a new malloc for some int newsize = 4*size/3 + 10; only when the old size is too small). Of course, your array will generally be a bit larger than what is really needed, but memory is quite cheap and you are allowed to "lose" some of it.
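A minimal sketch of such a geometrically growing array, using the 4*size/3 + 10 rule mentioned above, might look like this. The fields of struct O_DATA are assumed from the examples in this thread, and the o_data_array type and the o_data_push / o_data_free names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct O_DATA {             /* fields assumed, not taken from the question */
    unsigned int index;
    float x, y, z;
};

struct o_data_array {       /* hypothetical growable container */
    struct O_DATA *items;
    size_t count;           /* number of elements in use */
    size_t capacity;        /* number of elements allocated */
};

/* Appends value, growing the storage only when it is full.
 * Returns 0 on success, -1 if realloc fails (the array stays valid). */
static int o_data_push(struct o_data_array *arr, struct O_DATA value)
{
    assert(arr != NULL);
    if (arr->count == arr->capacity) {
        size_t newsize = 4 * arr->capacity / 3 + 10;
        /* realloc(NULL, n) behaves like malloc(n), so an empty array works */
        struct O_DATA *grown = realloc(arr->items, newsize * sizeof *grown);
        if (grown == NULL)
            return -1;
        arr->items = grown;
        arr->capacity = newsize;
    }
    arr->items[arr->count++] = value;
    return 0;
}

/* Releases the storage and resets the container to the empty state. */
static void o_data_free(struct o_data_array *arr)
{
    assert(arr != NULL);
    free(arr->items);
    arr->items = NULL;
    arr->count = arr->capacity = 0;
}
```

Starting from a zero-initialized struct o_data_array, each parsed record can be pushed inside the reading loop, and the ith point is then available as arr.items[i] to any function that receives the container.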
But StackOverflow is not a "do my homework" site. I gave some advice above, but you should do your homework.
edited Mar 8 at 0:14
answered Mar 7 at 23:13
Basile Starynkevitch
2
Try printing the error (use the perror function to do it most simply). Most likely reason: the current working directory is not what you think it is, and that's why the file isn't found. Try using an absolute path to input.dat to see if it helps.
– hyde
Mar 7 at 22:02
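A minimal sketch of the perror suggestion, assuming the file is opened with fopen; the helper name open_or_report is hypothetical, and the exact error text depends on the platform.

```c
#include <stdio.h>

/* Open a file for reading; on failure print the reason to stderr.
   The path is whatever the caller passes, e.g. "input.dat" as in the comment. */
FILE *open_or_report(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        perror(path);   /* prints the path, a colon, and the system error message */
    return fp;
}
```

If the message says the file does not exist, checking the current working directory or passing an absolute path is the next step.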
6
The example file is text! fread is NOT going to work.
– xing
Mar 7 at 22:06
2
Indeed. You will need to read it as text (fscanf or fgets + some parsing).
– Eugene Sh.
Mar 7 at 22:08
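As a rough illustration of "fgets + some parsing", the sketch below reads a text file line by line and converts each line with strtod. The function name read_numbers and the "one decimal number per line" format are assumptions for the example, not the format of the file in the question.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read up to max lines from fp, parsing one number per line.
   Assumes a hypothetical format of one decimal number per line
   and lines shorter than the buffer.
   Returns the number of values stored; stops at the first bad line. */
size_t read_numbers(FILE *fp, double *out, size_t max)
{
    char line[256];
    size_t n = 0;
    while (n < max && fgets(line, sizeof line, fp) != NULL) {
        char *end;
        double v = strtod(line, &end);
        if (end == line)
            break;            /* nothing on this line could be converted */
        out[n++] = v;
    }
    return n;
}
```

Unlike fread, this treats the file as text: each line is read into a character buffer first and only then converted to a number.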
1
As @xing pointed out, reading text files and parsing them into values is a several-step process.
– Ahmed Masud
Mar 7 at 22:09
1
Why are you passing sizeof(struct O_DATA) to fread? The number of bytes you want to read from the file has nothing to do with how many bytes your platform uses to store struct O_DATA!
– David Schwartz
!– David Schwartz
Mar 7 at 22:09