Tensorflow premade estimator is much slower than custom?

I'm benchmarking general TF operations, so to establish a baseline I'm trying to figure out how quickly I can train a simple logistic regression with a single pass over the training data. My input is a TFRecord file containing 860,000 sparse rows with 164,000 one-hot encoded features. The data-processing code is at the bottom.

A premade estimator (tf.estimator.LinearClassifier), configured as follows, fits one pass of the data in 932 seconds:

import os
import tensorflow as tf  # TF 1.x API

# MODELDIR and currtime() are assumed to be defined elsewhere.
feature_columns = [tf.feature_column.numeric_column(key='features', shape=164000)]

# Passing None turns off periodic summary and checkpoint saving.
custom_config = tf.estimator.RunConfig(save_summary_steps=None,
                                       save_checkpoints_steps=None)

estimator = tf.estimator.LinearClassifier(
    feature_columns=feature_columns,
    model_dir=os.path.join(MODELDIR, f'PremadeLinearClassifier_{currtime()}'),
    config=custom_config
)

If I read the data into NumPy arrays and build the dataset with tf.data.Dataset.from_tensor_slices(), that drops to 467 seconds.

If I build my own training functions, I can perform a single pass of the data in 66.6 seconds, reading from disk.
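To put the three timings on a common scale, the short calculation below converts them to rows per second and seconds per batch. It assumes the default batch_size of 1000 from ex_input_fn was used in every run, which the question does not state; the labels and variable names are only illustrative.

```python
import math

ROWS = 860_000      # rows in the TFRecord file (from the question)
BATCH_SIZE = 1000   # assumption: the default batch_size of ex_input_fn

# Wall-clock seconds for one pass, as reported in the question.
timings = {
    "premade estimator, TFRecord": 932.0,
    "premade estimator, in-memory": 467.0,
    "custom loop, TFRecord": 66.6,
}

steps = math.ceil(ROWS / BATCH_SIZE)  # number of batches in one pass


def throughput(seconds):
    """Return (rows per second, seconds per batch) for one pass."""
    return ROWS / seconds, seconds / steps


for name, seconds in timings.items():
    rows_per_sec, sec_per_step = throughput(seconds)
    print(f"{name}: {rows_per_sec:,.0f} rows/s, {sec_per_step:.3f} s/batch")

speedup = timings["premade estimator, TFRecord"] / timings["custom loop, TFRecord"]
print(f"custom loop vs. premade estimator: {speedup:.1f}x")
```

Under that assumption the custom loop is roughly 14 times faster than the premade estimator when both read from the TFRecord file.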

class LogisticModel(object):
    """Logistic regression over 164,000 sparse input features."""

    def __init__(self):
        self.W = tf.Variable(tf.random_normal([164000, 1], mean=0.0, stddev=0.1))
        self.B = tf.Variable(tf.random_normal([], mean=0.0, stddev=0.1))

    def __call__(self, x):
        # x is a SparseTensor of shape [batch_size, 164000]; returns the logits.
        return tf.sparse_tensor_dense_matmul(x, self.W) + self.B


def grad_fn(model, inputs, targets):
    with tf.GradientTape() as t:
        loss_val = loss_fn(model, inputs, targets)
    return t.gradient(loss_val, [model.W, model.B])


def loss_fn(model, inputs, targets):
    # Reshape the labels to [batch_size, 1] so they match the logits.
    target_size = targets.shape.as_list()[0]
    return tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(
            labels=tf.reshape(targets, [target_size, 1]),
            logits=model(inputs)))


def perform_train(model, optim, dataset):
    # One pass over the dataset; iterating it directly needs eager execution.
    step = 0
    for x, y in dataset:
        grads = grad_fn(model, x, y)
        optim.apply_gradients(zip(grads, [model.W, model.B]),
                              global_step=tf.train.get_or_create_global_step())
        if step % 20 == 0:
            print(f"Step {step}: {loss_fn(model, x, y)}")
        step += 1
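For readers unfamiliar with the sparse ops, the plain-Python sketch below spells out what the model and loss compute for a single row: the logit is the weighted sum over the row's non-zero columns plus the bias, and the loss is the sigmoid cross-entropy in the numerically stable form given in the TensorFlow documentation. It is only an illustration, not the TensorFlow implementation; sparse_logit, sigmoid_cross_entropy, and the toy weights are made up for the example.

```python
import math


def sparse_logit(indices, values, weights, bias):
    """Logit of one sparse row: sum of values[k] * weights[indices[k]], plus bias."""
    return sum(v * weights[i] for i, v in zip(indices, values)) + bias


def sigmoid_cross_entropy(label, logit):
    """Stable form: max(x, 0) - x * z + log(1 + exp(-|x|)) for logit x, label z."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


# Toy example: 5 features instead of 164,000; the row has ones at columns 1 and 3.
weights = [0.1, -0.2, 0.3, 0.4, -0.5]
bias = 0.05
indices, values = [1, 3], [1.0, 1.0]

logit = sparse_logit(indices, values, weights, bias)  # -0.2 + 0.4 + 0.05
loss = sigmoid_cross_entropy(1.0, logit)
print(f"logit={logit:.2f} loss={loss:.4f}")
```

Only the non-zero columns are touched, which is why the cost per row depends on the number of active features rather than on the full width of 164,000.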

1) What could account for this huge speed difference? Is overhead in the estimators that significant?
2) Without reading into memory, can my custom function be improved further? An in-house C++-based library is still an order of magnitude faster.

Data generation:

def ex_to_tensors(ex, tensor_size):
    # Parse one serialized tf.Example into a (sparse features, label) pair.
    feature_spec = {
        'sparse': tf.SparseFeature(index_key='indices',
                                   value_key='values',
                                   dtype=tf.int64,
                                   size=tensor_size),
        'label': tf.FixedLenFeature([], tf.int64, default_value=0)
    }

    parsed_dict = tf.parse_single_example(ex, feature_spec)

    return (tf.cast(parsed_dict['sparse'], tf.float32),
            tf.cast(parsed_dict['label'], tf.float32))


def ex_input_fn(*filenames, batch_size=1000, feature_size=int(1e6)):

    def parseTensors(x):
        return ex_to_tensors(x, feature_size)

    dataset = (tf.data.TFRecordDataset(filenames)
               .map(parseTensors)
               .batch(batch_size))

    return dataset
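As an illustration of the data layout, the sketch below mimics in plain Python what this pipeline hands to the model: each example stores only the positions ('indices') and values ('values') of its non-zero features, and batching stacks them into coordinate form, with one [row, column] pair per non-zero entry. The helper batch_sparse_rows and the toy rows are hypothetical and are not part of the original code.

```python
def batch_sparse_rows(rows, feature_size):
    """Stack per-example (indices, values) pairs into one coordinate-format batch.

    Returns (coords, values, dense_shape), where coords holds one
    [row, column] pair per non-zero entry.
    """
    coords, flat_values = [], []
    for row_id, (indices, values) in enumerate(rows):
        for col, val in zip(indices, values):
            if not 0 <= col < feature_size:
                raise ValueError(f"column {col} is outside [0, {feature_size})")
            coords.append([row_id, col])
            flat_values.append(float(val))
    return coords, flat_values, [len(rows), feature_size]


# Three toy examples over 8 features; the second row has no non-zero entries.
rows = [([0, 5], [1, 1]), ([], []), ([3], [1])]
coords, values, dense_shape = batch_sparse_rows(rows, feature_size=8)
print(coords, values, dense_shape)
```

The dense shape of a batch is [batch_size, feature_size], so feature_size has to agree with the first dimension of the weight matrix W for the sparse matrix product to be valid.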

asked 2 days ago

Patrick McCarthy


Tags: python, tensorflow, sparse-matrix, tensorflow-estimator, sparse-file
