Store the most recent N MBs of a stream?

I currently use wget or curl to download a continuous AAC stream. I would like to limit what is stored on disk to the most recent N MB. In other words, some kind of size-limited FIFO buffer (I guess?). Any ideas on how to accomplish this? This is on OS X/BSD.

The purpose is to be able to stop the stream when something interesting has happened and then extract the last few minutes from it to permanent storage.

Update: an alternative solution would be to interrupt the download every N MB, start a new local file, and rotate out the previous one (that is, rename it with a sequence number, timestamp or similar). However, with this approach there needs to be a substantial overlap between the files.
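
The rotation idea in the update could be sketched roughly as below. This is only an illustration under assumptions: the class name `OverlapRotator`, the `segmentNNNNN.aac` naming scheme and the `segment_bytes`, `overlap_bytes` and `keep` parameters are all made up for the example. Each new segment starts with the last `overlap_bytes` of the previous one, and only the newest `keep` segments are left on disk.

```python
import os
from collections import deque


class OverlapRotator:
    """Write a byte stream into numbered segment files that overlap."""

    def __init__(self, directory, segment_bytes, overlap_bytes, keep=3):
        if not 0 <= overlap_bytes < segment_bytes:
            raise ValueError("overlap must be smaller than the segment size")
        self.directory = directory
        self.segment_bytes = segment_bytes
        self.overlap_bytes = overlap_bytes
        self.keep = keep
        self.tail = bytearray()   # the last overlap_bytes seen so far
        self.paths = deque()      # segment files currently on disk
        self.seq = 0
        self.out = None
        self.written = 0          # bytes in the current segment

    def _open_next(self):
        if self.out:
            self.out.close()
        self.seq += 1
        path = os.path.join(self.directory, "segment%05d.aac" % self.seq)
        self.out = open(path, "wb")
        self.paths.append(path)
        # Start the new segment with the tail of the previous one.
        self.out.write(self.tail)
        self.written = len(self.tail)
        # Delete the oldest segments so disk use stays bounded.
        while len(self.paths) > self.keep:
            os.remove(self.paths.popleft())

    def write(self, data):
        data = memoryview(data)
        while len(data):
            if self.out is None or self.written >= self.segment_bytes:
                self._open_next()
            chunk = data[:self.segment_bytes - self.written]
            self.out.write(chunk)
            self.written += len(chunk)
            # Remember only the newest overlap_bytes for the next segment.
            self.tail += chunk
            excess = len(self.tail) - self.overlap_bytes
            if excess > 0:
                del self.tail[:excess]
            data = data[len(chunk):]

    def close(self):
        if self.out:
            self.out.close()
            self.out = None
```

To use it as a filter, one would read chunks from `sys.stdin.buffer` in a loop and pass each chunk to `write()`. Disk use is then bounded by roughly `keep * segment_bytes`.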







asked Oct 22 '17 at 18:56 by d-b, edited Oct 23 '17 at 14:54

  • I would think in terms of a circular buffer <en.wikipedia.org/wiki/Circular_buffer>. That would keep everything in memory, constantly overwriting within the memory allocated. When you saw something interesting (how?), the buffer could be written to a file, if that is where you would do the processing. The upshot is that the slow disk is never touched until it is needed. Somewhat blue-skying here, but circular buffers were used extensively in the CDC 6600 operating systems, which I used for a number of years.
    – drl
    Oct 30 '17 at 2:43

  • How? I listen to the stream. Thanks for your suggestion, any ideas how to implement it?
    – d-b
    Oct 30 '17 at 6:24

  • Let's call this chunk of software cb. You would need to be able to interrupt the wget -> cb connection, so cb would need to handle a signal from the keyboard; cb then dumps the buffer to disk, and you use whatever tools you like to look at the file. The circular buffer is managed by four pointers: First, Limit, In, Out. Two processes can run simultaneously. If you use C++, the Boost libraries have a module <theboostcpplibraries.com/boost.circularbuffer>; if you use Perl, there is some advice at <perlmonks.org?node_id=894330>.
    – drl
    Oct 30 '17 at 12:05

  • Boost: <theboostcpplibraries.com/boost.circularbuffer>
    – drl
    Oct 30 '17 at 12:11

1 Answer

Here's a simple Python script that may help. It reads from stdin and writes to stdout, keeping the last N bytes in memory. If you interrupt it with Control-C (SIGINT), it dumps the buffer to a file (/tmp/sample001 the first time, /tmp/sample002 the next, and so on) and then continues.



    #!/usr/bin/python3
    # circular buffer in memory without recopy using bytearray
    # https://unix.stackexchange.com/a/401875/119298
    import sys, os

    def copy():
        def dump(fd, v):
            fd.write(v)

        space = 10000000            # buffer size in bytes (the "N")
        buffer = bytearray(space)   # all zero bytes
        view = memoryview(buffer)
        head = 0; wrapped = False   # next write offset; True once the buffer has filled
        # reopen stdin and stdout as unbuffered binary streams
        sys.stdin = os.fdopen(sys.stdin.fileno(), 'rb', 0)
        sys.stdout = os.fdopen(sys.stdout.fileno(), 'wb', 0)
        fileno = 1
        while True:
            try:
                # read straight into the free space after head
                nbytes = sys.stdin.readinto(view[head:])
                if nbytes == 0:
                    break  # eof
                sys.stdout.write(view[head:head+nbytes].tobytes())
                head += nbytes
                if head >= space:
                    head = 0; wrapped = True
            except KeyboardInterrupt:
                filename = "/tmp/sample%03d" % fileno
                fileno += 1
                with open(filename, "wb") as fd:
                    if wrapped:
                        dump(fd, view[head:])   # oldest data: from head to the end
                    if head:
                        dump(fd, view[0:head])  # newest data: from the start up to head

    copy()
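
One practical wrinkle: Control-C typed at a terminal is delivered to every process in the foreground process group, so a wget or curl feeding the script through a pipe will normally receive the SIGINT too. A possible variant, sketched here under assumptions and not part of the script above, dumps on SIGUSR1 instead, which can be sent from another terminal with `kill -USR1 <pid>`. The names `RingBuffer` and `main`, the 64 KiB read size and the choice of SIGUSR1 are all assumptions made for illustration.

```python
import os
import signal
import sys


class RingBuffer:
    """Fixed-size in-memory buffer that keeps only the newest bytes."""

    def __init__(self, size):
        self.size = size
        self.buf = bytearray(size)
        self.head = 0          # next write position
        self.wrapped = False   # True once the buffer has filled at least once

    def write(self, data):
        data = memoryview(data)[-self.size:]     # only the newest bytes can fit
        first = min(len(data), self.size - self.head)
        self.buf[self.head:self.head + first] = data[:first]
        rest = len(data) - first
        self.buf[:rest] = data[first:]           # wrap around to the start
        new_head = self.head + len(data)
        if new_head >= self.size:
            self.wrapped = True
        self.head = new_head % self.size

    def snapshot(self):
        """Return the buffered bytes, oldest first."""
        if self.wrapped:
            return bytes(self.buf[self.head:]) + bytes(self.buf[:self.head])
        return bytes(self.buf[:self.head])


def main(size=10_000_000, prefix="/tmp/sample"):
    ring = RingBuffer(size)
    dump_requested = False

    def on_usr1(signum, frame):
        nonlocal dump_requested
        dump_requested = True      # do the real work in the main loop

    signal.signal(signal.SIGUSR1, on_usr1)
    count = 0
    while True:
        chunk = os.read(sys.stdin.fileno(), 65536)
        if not chunk:
            break                  # end of input
        sys.stdout.buffer.write(chunk)
        sys.stdout.buffer.flush()
        ring.write(chunk)
        if dump_requested:
            dump_requested = False
            count += 1
            with open("%s%03d" % (prefix, count), "wb") as fd:
                fd.write(ring.snapshot())
```

Calling `main()` at the end of the file turns this into a filter like the original. Because the handler only sets a flag, the dump is written after the next chunk arrives, which on a continuous stream should be almost immediately.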


If you don't have Python 3, it will need a few changes for Python 2.7.
You may need to worry about preserving legal AAC framing, but it is worth trying first: whatever player or tool you use may manage to resynchronise from data that starts at an arbitrary offset.
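
If the player does not resynchronise on its own, the dump could be trimmed to a frame boundary first. The sketch below assumes the stream is AAC in ADTS framing, which the question does not state; in ADTS every frame starts with a 12-bit syncword of all ones and carries a 13-bit frame length in its header. The function names and the `confirm` parameter are hypothetical. It looks for the first offset where several consecutive headers line up, to reduce false matches inside audio data.

```python
def adts_frame_length(data, pos):
    """Return the ADTS frame length at pos, or 0 if no header starts there."""
    if pos + 7 > len(data):
        return 0
    # 12-bit syncword 0xFFF, and the 2-bit layer field must be 00.
    if data[pos] != 0xFF or (data[pos + 1] & 0xF6) != 0xF0:
        return 0
    # 13-bit frame length: 2 bits of byte 3, all of byte 4, top 3 bits of byte 5.
    length = ((data[pos + 3] & 0x03) << 11) | (data[pos + 4] << 3) | (data[pos + 5] >> 5)
    return length if length >= 7 else 0


def first_frame_offset(data, confirm=3):
    """Find the first offset where `confirm` consecutive ADTS headers line up.

    Returns -1 if no such offset exists.
    """
    for start in range(len(data) - 6):
        pos = start
        for _ in range(confirm):
            length = adts_frame_length(data, pos)
            if length == 0:
                break
            pos += length
        else:
            return start
    return -1
```

Given the bytes of a dump in `data`, `data[first_frame_offset(data):]` would then start on a frame boundary, provided the offset is not -1. The last frame in the dump may still be cut short.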






answered Nov 1 '17 at 17:36 by meuh