Use internal representation of autoencoder for anomaly detection
I've trained an autoencoder to recognize 'positive' time series (the network is a simple fully connected network, no recurrent layers). My advisor suggests that I try to detect anomalies using some statistics on the latent space (for example, the difference between histograms of the latent space for good and outlier data), but when I predict time series with outliers I get the same internal representation as with the good data. I believe this is because my network can only reproduce the normal data.
Do you have any hints?
Thanks
keras neural-network artificial-intelligence autoencoder anomaly-detection
3
I'm voting to close this question as off-topic because it is not a programming question.
– Matias Valdenegro
Nov 24 '18 at 23:29
asked Nov 24 '18 at 20:28
Andrea Guidi
1 Answer
I guess you are trying to detect anomalies by using the reconstruction error of your network, i.e. train on some normal time series, then detect on a time series which includes outliers. If I am further guessing right, your advisor is suggesting that the internal representation will tell you more about the nature of the anomaly, i.e. which features of the input data are most abnormal.
The first thing to note is that the intermediate features your network built from only legitimate data are fixed in detection mode: the internal representation, in the sense of the weights of each of your intermediate layers' neurons, will not change when processing a new datapoint in this mode. Only the activations computed with those weights depend on the input.
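To make the distinction between fixed weights and per-input activations concrete, here is a minimal sketch. It is not the asker's Keras network: as an assumption for illustration, it swaps in a linear autoencoder obtained from an SVD (PCA) in NumPy, and the data, the window length of 8, the latent size of 2, and the helper names `encode` and `reconstruction_error` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "normal" training data: 500 windows of 8 samples that lie
# close to a 2-dimensional subspace, plus a little noise.
basis = rng.normal(size=(2, 8))
train = rng.normal(size=(500, 2)) @ basis + 0.01 * rng.normal(size=(500, 8))

# Stand-in for training: a linear autoencoder obtained from an SVD (PCA).
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
weights = vt[:2].T  # encoder weights, shape (8, 2); frozen from here on


def encode(x):
    """Latent code (activations) for one or more windows."""
    return (x - mean) @ weights


def reconstruction_error(x):
    """Mean squared reconstruction error per window."""
    x_hat = encode(x) @ weights.T + mean
    return np.mean((x - x_hat) ** 2, axis=-1)


weights_before = weights.copy()

normal = rng.normal(size=(1, 2)) @ basis
outlier = normal.copy()
outlier[0, 3] += 5.0  # inject a spike into one sample of the window

z_normal, z_outlier = encode(normal), encode(outlier)

print("weights unchanged:", np.array_equal(weights, weights_before))
print("latent codes equal:", np.allclose(z_normal, z_outlier))
print("errors:", reconstruction_error(normal), reconstruction_error(outlier))
```

In this sketch the weights stay identical after encoding, while the latent codes and reconstruction errors are computed per input. How far apart the latent codes of good and outlier data end up in a real network depends on the architecture and the data, so this only illustrates where to look, not what you will find.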
You will only be able to reason about the root cause of the outlier (which dimension contributes the most to the reconstruction error) if you have a good idea of which original features your intermediate features represent. This can be quite hard if you have one fully connected auto-encoder and several hidden layers, where the contribution of each feature to each neuron is increasingly interleaved with the other features. A trick is to build one auto-encoder per set of features, and use them as an ensemble for anomaly prediction. That way, each auto-encoder in the ensemble is known to be responsible for a set of features, which makes it easier to know how each set of features contributes to the anomaly. See an example here.
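The ensemble idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the linked example: each member is again a PCA-based linear autoencoder in NumPy rather than a Keras model, and the feature groups (`group_a`, `group_b`), the synthetic data, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)


def fit_linear_autoencoder(x, latent_dim):
    """Fit a PCA-based linear autoencoder; returns (mean, components)."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:latent_dim]


def reconstruction_error(model, x):
    """Mean squared reconstruction error per row of x."""
    mean, components = model
    centered = x - mean
    x_hat = centered @ components.T @ components
    return np.mean((centered - x_hat) ** 2, axis=1)


# Hypothetical feature groups: column indices handled by each autoencoder.
groups = {"group_a": [0, 1, 2, 3], "group_b": [4, 5, 6, 7]}

# Hypothetical normal data: each group is driven by its own single factor.
n = 1000
factor_a = rng.normal(size=(n, 1))
factor_b = rng.normal(size=(n, 1))
train = np.hstack([
    factor_a @ rng.normal(size=(1, 4)),
    factor_b @ rng.normal(size=(1, 4)),
]) + 0.05 * rng.normal(size=(n, 8))

# One autoencoder per feature group, trained on normal data only.
ensemble = {
    name: fit_linear_autoencoder(train[:, cols], latent_dim=1)
    for name, cols in groups.items()
}


def errors_by_group(x):
    """Reconstruction error of each group's autoencoder for each row of x."""
    return {
        name: reconstruction_error(ensemble[name], x[:, cols])
        for name, cols in groups.items()
    }


# Take a normal sample and corrupt only features belonging to group_b.
sample = train[:1].copy()
sample[0, [5, 6]] += np.array([4.0, -4.0])

scores = errors_by_group(sample)
suspect = max(scores, key=lambda name: scores[name][0])
print(scores, "->", suspect)
```

Because each member only ever sees its own columns, a large error from one member points directly at that set of features, which is the attribution a single fully connected auto-encoder makes hard to recover.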
answered Nov 27 '18 at 21:57
Bacon