Hello!
I have a MobileNet-like graph whose final operation is ConcatV2.
All layers before the concat produce results without NaNs, but the final concat layer produces NaNs:
https://drive.google.com/open?id=1WydK5qRbfTTK8dUDBN5qfftP72P2sQWi
mvNCCheck graph_optim.pb -in=image -on TfPoseEstimator/feat_concat
Result: (28, 28, 864)
1) 677375 nan
2) 461212 nan
3) 461240 nan
4) 461239 nan
5) 461238 nan
Expected: (1, 28, 28, 864)
1) 676897 32.22853
2) 653518 32.084774
3) 653723 32.062202
4) 653946 31.285995
5) 654040 29.77494
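For reference, ConcatV2 just stacks the three feature maps along the channel axis, so concatenation by itself cannot introduce NaNs. A minimal NumPy sketch (the 96/384/384 channel split is an assumption chosen to sum to the 864 channels reported by mvNCCheck):

```python
import numpy as np

# Hypothetical feature maps standing in for the three concat inputs.
a = np.random.rand(1, 28, 28, 96).astype(np.float32)
b = np.random.rand(1, 28, 28, 384).astype(np.float32)
c = np.random.rand(1, 28, 28, 384).astype(np.float32)

# ConcatV2 with axis=3 is channel-wise concatenation.
out = np.concatenate([a, b, c], axis=3)
print(out.shape)             # (1, 28, 28, 864)
print(np.isnan(out).any())   # False: NaN-free inputs give a NaN-free concat
```

This is why NaNs appearing only at the concat output point at the graph compiler rather than at the operation itself.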
node {
  name: "TfPoseEstimator/feat_concat"
  op: "ConcatV2"
  input: "TfPoseEstimator/Conv2d_3_pool"
  input: "TfPoseEstimator/MobilenetV1/Conv2d_7_pointwise/Relu"
  input: "TfPoseEstimator/MobilenetV1/Conv2d_11_pointwise/Relu"
  input: "TfPoseEstimator/feat_concat/axis"
  attr {
    key: "N"
    value {
      i: 3
    }
  }
  attr {
    key: "T"
    value {
      type: DT_FLOAT
    }
  }
  attr {
    key: "Tidx"
    value {
      type: DT_INT32
    }
  }
}
I've checked all the inputs:
input: "TfPoseEstimator/Conv2d_3_pool"
input: "TfPoseEstimator/MobilenetV1/Conv2d_7_pointwise/Relu"
input: "TfPoseEstimator/MobilenetV1/Conv2d_11_pointwise/Relu"
and they all give OK results (for example, TfPoseEstimator/Conv2d_3_pool):
mvNCCheck /home/fast/openpose/tf-pose-estimation/chk2/graph_optim.pb -in=image -on TfPoseEstimator/Conv2d_3_pool
Result: (28, 28, 96)
1) 72639 15.2
2) 72653 14.516
3) 72659 13.28
4) 73311 12.97
5) 73695 12.97
Expected: (1, 28, 28, 96)
1) 72639 15.215018
2) 72653 14.516435
3) 72659 13.285161
4) 74559 12.965318
5) 73695 12.963984
(inputs also pass every test)
If I add a proxy (for example, a max pool with kernel=1) after 'TfPoseEstimator/MobilenetV1/Conv2d_7_pointwise/Relu:0' and replace the original input to the concat with this proxy, the output becomes correct!
import tensorflow as tf

# Assumes the frozen graph has already been loaded into `graph`.
conc_in_1 = graph.get_tensor_by_name('TfPoseEstimator/Conv2d_3_pool:0')
conc_in_2 = graph.get_tensor_by_name('TfPoseEstimator/MobilenetV1/Conv2d_7_pointwise/Relu:0')
conc_in_3 = graph.get_tensor_by_name('TfPoseEstimator/MobilenetV1/Conv2d_11_pointwise/Relu:0')
# A 1x1 max pool with stride 1 passes values through unchanged.
proxy = tf.nn.max_pool(conc_in_2, ksize=[1, 1, 1, 1], strides=[1, 1, 1, 1], padding='SAME', name='proxy')
test_conc1 = tf.concat([conc_in_1, proxy, conc_in_3], 3, name='test_conc')
Result: (28, 28, 864)
1) 676897 32.22
2) 653723 32.16
3) 653946 31.42
4) 654040 29.88
5) 653920 29.81
Expected: (1, 28, 28, 864)
1) 676897 32.2097
2) 653723 32.097595
3) 653518 32.080677
4) 653946 31.31227
5) 654040 29.789398
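Note that a max pool with ksize=1 and stride=1 reduces over a single element, so it is numerically an identity: the proxy cannot change the computed values, it only changes the graph structure the compiler sees. A quick NumPy check of that claim:

```python
import numpy as np

def max_pool_1x1_stride1(x):
    # Each output element is the max over a 1x1 window,
    # i.e. the element itself, so this is an identity op.
    return np.maximum(x, x)

# Hypothetical feature map with the shape of the Relu output.
x = np.random.rand(1, 28, 28, 384).astype(np.float32)
y = max_pool_1x1_stride1(x)
print(np.array_equal(x, y))  # True
```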
So this is clearly a compiler bug, and one that can be worked around!
@jokilokis Thanks for pointing this out. The bug has to do with the last layer being a concat layer. We're working on a release candidate for this issue at the moment.
Wow, thanks @Tome_at_Intel. Does this mean I need to add a fake layer so that the concat is not the last layer?
I would like to know when this bug will be fixed.
@songyoff try my approach: add a fake pooling layer after one of the concat's input nodes.
TfPoseEstimator/feat_concat/Proxy 0.7 651.8 1.985
As the profile shows, pooling is a very fast operation on the stick, and it helps avoid the compiler optimizations that lead to the bug.
Here is a proto representation of what I mean:
node {
  name: "TfPoseEstimator/MobilenetV1/Conv2d_7_pointwise/Relu/Proxy"
  op: "MaxPool"
  input: "TfPoseEstimator/MobilenetV1/Conv2d_7_pointwise/Relu"
  attr {
    key: "T"
    value {
      type: DT_FLOAT
    }
  }
  attr {
    key: "data_format"
    value {
      s: "NHWC"
    }
  }
  attr {
    key: "ksize"
    value {
      list {
        i: 1
        i: 1
        i: 1
        i: 1
      }
    }
  }
  attr {
    key: "padding"
    value {
      s: "SAME"
    }
  }
  attr {
    key: "strides"
    value {
      list {
        i: 1
        i: 1
        i: 1
        i: 1
      }
    }
  }
}
node {
  name: "TfPoseEstimator/feat_concat"
  op: "ConcatV2"
  input: "TfPoseEstimator/Conv2d_3_pool"
  input: "TfPoseEstimator/MobilenetV1/Conv2d_7_pointwise/Relu/Proxy"
  input: "TfPoseEstimator/MobilenetV1/Conv2d_11_pointwise/Relu"
  input: "TfPoseEstimator/feat_concat/axis"
  attr {
    key: "N"
    value {
      i: 3
    }
  }
  attr {
    key: "T"
    value {
      type: DT_FLOAT
    }
  }
  attr {
    key: "Tidx"
    value {
      type: DT_INT32
    }
  }
}
Hope this helps while the fix is coming.
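For anyone rewiring a frozen graph by hand, the transformation above amounts to adding one node and repointing one input of the concat. A minimal sketch using plain dicts in place of real GraphDef NodeDef protos (node names are shortened; this is an illustration of the rewiring step, not the TensorFlow API):

```python
# Toy graph: each node maps to its op type and input names.
graph = {
    "feat_concat": {
        "op": "ConcatV2",
        "inputs": ["Conv2d_3_pool",
                   "Conv2d_7_pointwise/Relu",
                   "Conv2d_11_pointwise/Relu",
                   "feat_concat/axis"],
    },
}

def insert_proxy(graph, consumer, original_input):
    # Add an identity-like MaxPool node fed by the original tensor...
    proxy_name = original_input + "/Proxy"
    graph[proxy_name] = {"op": "MaxPool", "inputs": [original_input]}
    # ...and repoint the consumer's matching input at the proxy.
    graph[consumer]["inputs"] = [
        proxy_name if name == original_input else name
        for name in graph[consumer]["inputs"]
    ]

insert_proxy(graph, "feat_concat", "Conv2d_7_pointwise/Relu")
print(graph["feat_concat"]["inputs"])
# ['Conv2d_3_pool', 'Conv2d_7_pointwise/Relu/Proxy',
#  'Conv2d_11_pointwise/Relu', 'feat_concat/axis']
```

In a real graph you would perform the same rewiring on the GraphDef's NodeDefs (as in the proto dump above) and re-serialize the graph before compiling.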
@jokilokis thanks a lot, it's a good workaround until the bug is fixed.
@jokilokis, I have run into the same problem. Adding a proxy gives normal output for 'feat_concat', but the subsequent layers still produce NaNs. Have you successfully run the pose network on the stick?