Tensorflow premade estimator is much slower than custom?
I'm benchmarking general TF operations. To establish a baseline, I'm trying to find out how quickly I can train a simple logistic regression with a single pass over the training data. My input is a TFRecord file containing 860,000 sparse rows with 164,000 one-hot encoded features. The data-processing code is at the bottom.
A premade estimator (tf.estimator.LinearClassifier), configured like so, fits one pass of the data in 932 seconds:
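    import os
    import tensorflow as tf

    feature_columns = [tf.feature_column.numeric_column(key='features', shape=164000)]

    # Disable summaries and checkpoints so they don't count toward the timing.
    custom_config = tf.estimator.RunConfig(save_summary_steps=None,
                                           save_checkpoints_steps=None)

    # MODELDIR and currtime() come from the rest of my script (not shown).
    estimator = tf.estimator.LinearClassifier(
        feature_columns=feature_columns,
        model_dir=os.path.join(MODELDIR, f'PremadeLinearClassifier_{currtime()}'),
        config=custom_config
    )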
If I read the data into NumPy arrays and create the dataset with from_tensor_slices(), I can get that down to 467 seconds.
If I build my own training functions, I can perform a single pass of the data in 66.6 seconds, reading from disk.
    class LogisticModel(object):
        def __init__(self):
            # One weight per one-hot feature, plus a scalar bias.
            self.W = tf.Variable(tf.random_normal([164000, 1], mean=0, stddev=0.1))
            self.B = tf.Variable(tf.random_normal([], mean=0.0, stddev=0.1))

        def __call__(self, x):
            # x is a SparseTensor batch; the result is a [batch, 1] logit.
            return tf.sparse_tensor_dense_matmul(x, self.W) + self.B

    def grad_fn(model, inputs, targets):
        with tf.GradientTape() as t:
            loss_val = loss_fn(model, inputs, targets)
        return t.gradient(loss_val, [model.W, model.B])

    def loss_fn(model, inputs, targets):
        target_size = targets.shape.as_list()[0]
        return tf.reduce_mean(
            tf.nn.sigmoid_cross_entropy_with_logits(
                labels=tf.reshape(targets, [target_size, 1]),
                logits=model(inputs)))

    def perform_train(model, optim, dataset):
        # Uses the model and optimizer passed in, not module-level globals.
        step = 0
        for x, y in dataset:
            grads = grad_fn(model, x, y)
            optim.apply_gradients(zip(grads, [model.W, model.B]),
                                  global_step=tf.train.get_or_create_global_step())
            if step % 20 == 0:
                print(f"Step {step}: {loss_fn(model, x, y)}")
            step += 1
1) What could account for this huge speed difference? Is the overhead in the estimators really that significant?
2) Without reading into memory, can my custom function be improved further? An in-house C++-based library is still an order of magnitude faster.
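To put the three timings side by side, here is a quick calculation in plain Python. It uses only the row count and the wall-clock figures quoted above; the helper names (rows_per_second, speedup) are just for illustration.

```python
# Row count and timings as quoted in the question.
ROWS = 860_000
TIMINGS = {
    "premade estimator, TFRecord": 932.0,
    "premade estimator, from_tensor_slices": 467.0,
    "custom loop, TFRecord": 66.6,
}


def rows_per_second(rows, seconds):
    """Throughput of one pass over the data."""
    return rows / seconds


def speedup(baseline_seconds, other_seconds):
    """How many times faster `other` is than the baseline."""
    return baseline_seconds / other_seconds


baseline = TIMINGS["premade estimator, TFRecord"]
for name, seconds in TIMINGS.items():
    print(f"{name}: {rows_per_second(ROWS, seconds):,.0f} rows/s, "
          f"{speedup(baseline, seconds):.1f}x vs. premade/TFRecord")
```

So the custom loop is about 14 times faster than the premade estimator reading the same TFRecord file, and loading into memory first only roughly halves the premade estimator's time.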
Data generation:
    def ex_to_tensors(ex, tensor_size):
        # Each record stores a sparse row as parallel 'indices'/'values' lists.
        feature_spec = {
            'sparse': tf.SparseFeature(index_key='indices',
                                       value_key='values',
                                       dtype=tf.int64,
                                       size=tensor_size),
            'label': tf.FixedLenFeature([], tf.int64, default_value=0)
        }
        parsed_dict = tf.parse_single_example(ex, feature_spec)
        return (tf.cast(parsed_dict['sparse'], tf.float32),
                tf.cast(parsed_dict['label'], tf.float32))

    def ex_input_fn(*filenames, batch_size=1000, feature_size=int(1e6)):
        def parseTensors(x):
            return ex_to_tensors(x, feature_size)

        dataset = (tf.data.TFRecordDataset(filenames)
                   .map(parseTensors)
                   .batch(batch_size))
        return dataset
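For reference, the work per row is small: a parsed record is just a list of indices and values, and the model only needs a sparse dot product plus a sigmoid cross-entropy. Below is a minimal pure-Python sketch of that computation for a single row (no TensorFlow, no batching). The function names and the learning rate are my own for illustration and do not appear in the code above; it is meant to show the arithmetic, not to reproduce what TensorFlow does internally.

```python
import math


def sigmoid(x):
    """Sigmoid that avoids overflow for large negative inputs."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sparse_logit(indices, values, weights, bias):
    """Dot product of one sparse row with a dense weight vector, plus bias."""
    return sum(weights[i] * v for i, v in zip(indices, values)) + bias


def sigmoid_cross_entropy(logit, label):
    """Numerically stable form: max(x, 0) - x*z + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def sgd_step(indices, values, label, weights, bias, lr=0.1):
    """One SGD update for a single row; only the active weights change.

    Updates `weights` in place and returns the new bias.
    """
    err = sigmoid(sparse_logit(indices, values, weights, bias)) - label
    for i, v in zip(indices, values):
        weights[i] -= lr * err * v
    return bias - lr * err


# Tiny illustrative row: 5 features, two of them active (one-hot style).
weights = [0.0] * 5
bias = 0.0
row_indices, row_values, row_label = [1, 3], [1.0, 1.0], 1.0

before = sigmoid_cross_entropy(
    sparse_logit(row_indices, row_values, weights, bias), row_label)
bias = sgd_step(row_indices, row_values, row_label, weights, bias)
after = sigmoid_cross_entropy(
    sparse_logit(row_indices, row_values, weights, bias), row_label)
print(before, after)
```

Only the weights at the active indices are read or written, which is why I would expect a sparse implementation to scale with the number of non-zeros per row rather than with the 164,000 feature columns.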
python tensorflow sparse-matrix tensorflow-estimator sparse-file
asked 2 days ago
Patrick McCarthy