Use internal representation of autoencoder for anomaly detection



























I've trained an autoencoder to recognize 'positive' time series (the network is a simple fully connected network with no recurrent layers). My advisor suggests detecting anomalies using statistics on the latent space, such as the difference between the histograms of the latent space for good and outlier data. However, when I feed in time series containing outliers, I get the same internal representation as with the good data. I believe this is because my network can only reproduce the normal data.
Do you have any hints?
Thanks

































    I'm voting to close this question as off-topic because it is not a programming question.

    – Matias Valdenegro
    Nov 24 '18 at 23:29

























keras neural-network artificial-intelligence autoencoder anomaly-detection
















asked Nov 24 '18 at 20:28









Andrea Guidi




















1 Answer
































I guess you are trying to detect anomalies using the reconstruction error of your network, i.e. you train on normal time series, then run detection on a time series that includes outliers. If I am guessing right, your advisor is suggesting that the internal representation will tell you more about the nature of the anomaly, i.e. which features of the input data are most abnormal.
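To make the reconstruction-error idea concrete, here is a minimal sketch in plain Python. It assumes you already have each window and its reconstruction as lists of floats (for example from your trained network). The helper names, the sample numbers, and the "mean + 3 standard deviations" threshold rule are illustrative assumptions, not something taken from your setup.

```python
from statistics import mean, stdev


def reconstruction_error(x, x_hat):
    """Mean squared error between one window and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(normal_errors, k=3.0):
    """Threshold from errors on held-out normal data.

    The rule mean + k * std (with k = 3) is an assumed choice;
    a high quantile of the normal errors would work as well.
    """
    return mean(normal_errors) + k * stdev(normal_errors)


def is_anomaly(x, x_hat, threshold):
    """Flag a window whose reconstruction error exceeds the threshold."""
    return reconstruction_error(x, x_hat) > threshold


# Hypothetical errors measured on held-out normal windows.
normal_errors = [0.010, 0.012, 0.009, 0.011, 0.013, 0.010]
threshold = fit_threshold(normal_errors)

# Hypothetical window with a close and a poor reconstruction.
window = [0.0, 1.0, 0.0, 1.0]
good_reconstruction = [0.05, 0.95, 0.05, 0.95]
bad_reconstruction = [0.5, 0.5, 0.5, 0.5]

print(is_anomaly(window, good_reconstruction, threshold))
print(is_anomaly(window, bad_reconstruction, threshold))
```

The threshold is fitted only on normal data, which matches the training setup described in the question.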



The first thing to note is that the intermediate features your network learned from legitimate data are fixed in detection mode: the weights of each intermediate layer's neurons do not change when the network processes a new datapoint. Only the activations those weights produce, i.e. the latent code, can vary with the input.
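If you do want a statistic on the latent space, one simple option is to compare each new latent code against the distribution of codes seen on normal data. The sketch below assumes the codes have already been extracted as lists of floats (in Keras this could be done with a sub-model that ends at the bottleneck layer). The function names and the sample codes are hypothetical, and the per-dimension z-score is my assumed stand-in for the histogram comparison your advisor mentioned.

```python
from statistics import mean, stdev


def fit_latent_stats(codes):
    """Per-dimension mean and standard deviation of normal latent codes."""
    dims = list(zip(*codes))
    return [mean(d) for d in dims], [stdev(d) for d in dims]


def latent_score(code, mus, sigmas, eps=1e-8):
    """Largest absolute z-score of a code across latent dimensions."""
    return max(abs(c - m) / (s + eps) for c, m, s in zip(code, mus, sigmas))


# Hypothetical 2-dimensional latent codes obtained from normal windows.
normal_codes = [
    [0.90, -0.10],
    [1.10, 0.10],
    [1.00, 0.00],
    [0.95, 0.05],
    [1.05, -0.05],
]
mus, sigmas = fit_latent_stats(normal_codes)

typical = latent_score([1.02, 0.01], mus, sigmas)  # close to the normal codes
unusual = latent_score([2.50, 0.00], mus, sigmas)  # far away in dimension 0

print(typical, unusual)
```

Keep in mind that this only helps if outliers actually land in a different region of the latent space. If the encoder maps them onto the same codes as normal data, as you describe, the score will not separate them and the reconstruction error is likely the more informative signal.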



You will only be able to reason about the root cause of the outlier (which dimension contributes the most to the reconstruction error) if you have a good idea of which original features your intermediate features represent. This can be quite hard with a single fully connected auto-encoder and several hidden layers, where the contribution of each feature to each neuron becomes increasingly interleaved with the other features. A trick is to build one auto-encoder per set of features and use them as an ensemble for anomaly prediction. That way, each auto-encoder in the ensemble is known to be responsible for one set of features, which makes it easier to know how each set of features contributes to the anomaly. See an example here.
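As a rough sketch of how the ensemble can be used for attribution, assume each feature group has its own auto-encoder and you have already collected, per group, the input slice, its reconstruction, and a typical error on normal data. The group names, numbers, and the "error divided by baseline" ranking are illustrative assumptions.

```python
def mse(x, x_hat):
    """Mean squared error between an input slice and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def rank_groups(inputs, reconstructions, baselines):
    """Rank feature groups by error relative to their normal baseline.

    inputs, reconstructions: dict mapping group name -> list of floats
    baselines: dict mapping group name -> typical error of that group's
               auto-encoder on normal data
    Returns (group, ratio) pairs, most suspicious group first.
    """
    ratios = {
        name: mse(inputs[name], reconstructions[name]) / baselines[name]
        for name in inputs
    }
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Hypothetical feature groups, each handled by its own auto-encoder.
inputs = {
    "temperature": [20.0, 21.0, 20.5],
    "pressure": [1.0, 1.1, 5.0],
}
reconstructions = {
    "temperature": [20.1, 20.9, 20.4],
    "pressure": [1.0, 1.1, 1.2],
}
baselines = {"temperature": 0.02, "pressure": 0.01}

ranking = rank_groups(inputs, reconstructions, baselines)
print(ranking[0][0])  # the group whose error is furthest above its baseline
```

Dividing by a per-group baseline keeps groups with naturally noisy reconstructions from dominating the ranking.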
















answered Nov 27 '18 at 21:57

Bacon































