Store the most recent N MBs of a stream?

I currently use wget or curl to download a continuous AAC stream. I would like to limit what is stored on disk to the most recent N MB; in other words, some kind of size-limited FIFO buffer (I guess?). Any ideas how to accomplish this? This is on OS X/BSD.

The purpose is to be able to stop the stream when something interesting has happened and then extract the last few minutes of it to permanent storage.

Update: an alternative solution would be to interrupt the download every N MB, start a new local file, and rotate out the previous file (that is, rename it with a sequence number, time stamp or similar). However, with this approach there needs to be a substantial overlap between the files.
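
The rotation idea in the update could be sketched roughly as follows. This is only an illustration under assumptions, not a tested recording tool: the function name `rotate_with_overlap`, the `segmentNNN.aac` file names, and the `keep` parameter are all made up for the example. Each new file begins with the last `overlap` bytes of the previous one, and only the newest `keep` files stay on disk.

```python
import os

def rotate_with_overlap(stream, directory, segment_size, overlap, keep=2,
                        chunk_size=65536):
    """Copy `stream` into numbered files of at most `segment_size` bytes.

    Every new file starts with the last `overlap` bytes already written,
    and only the newest `keep` files are left on disk.
    Returns the paths of the files that still exist.
    (All names here are hypothetical, chosen for this sketch.)
    """
    if not 0 <= overlap < segment_size:
        raise ValueError("overlap must be smaller than segment_size")
    if keep < 1:
        raise ValueError("keep must be at least 1")
    paths = []
    tail = b""          # last `overlap` bytes of the stream seen so far
    out = None          # currently open segment file, if any
    written = 0         # bytes in the current segment, overlap included
    seq = 0
    try:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break   # end of stream
            while chunk:
                if out is None:
                    # Start a new segment, seeded with the overlap.
                    seq += 1
                    path = os.path.join(directory, "segment%03d.aac" % seq)
                    out = open(path, "wb")
                    out.write(tail)
                    written = len(tail)
                    paths.append(path)
                    if len(paths) > keep:
                        os.remove(paths.pop(0))   # drop the oldest segment
                room = segment_size - written
                piece, chunk = chunk[:room], chunk[room:]
                out.write(piece)
                written += len(piece)
                tail = (tail + piece)[-overlap:] if overlap else b""
                if written >= segment_size:
                    out.close()
                    out = None
    finally:
        if out is not None:
            out.close()
    return paths
```

It could be fed from the download with something like `rotate_with_overlap(sys.stdin.buffer, "/tmp", 50_000_000, 10_000_000)` at the end of a pipe from curl; the sizes are arbitrary placeholders.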

edited Oct 23 '17 at 14:54

asked Oct 22 '17 at 18:56

d-b


I would think in terms of a circular buffer <en.wikipedia.org/wiki/Circular_buffer>. That would keep everything in memory, constantly overwriting the oldest data once the allocated memory is full. When you saw something interesting (how?), the buffer could be written to a file, if that's where you would do some processing. The upshot is that the slow disk is never really touched until it is needed. Somewhat blue-skying here, but circular buffers were used extensively in the CDC 6600 OSs, which I used for a number of years.
– drl
Oct 30 '17 at 2:43

How? I listen to the stream. Thanks for your suggestion, any ideas how to implement it?
– d-b
Oct 30 '17 at 6:24

Let's call this chunk of software cb. One would need to be able to interrupt the wget -> cb connection, so cb would need to be able to process a signal from the keyboard. cb then dumps the buffer to disk, and you use whatever tools you like to look at the disk file. The circular buffer is managed by 4 pointers: First, Limit, In, Out. Two processes can run simultaneously. If you use C++, the Boost libraries have a module <theboostcpplibraries.com/boost.circularbuffer>; if you use Perl, there is some advice at <perlmonks.org?node_id=894330>
– drl
Oct 30 '17 at 12:05
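
As a rough illustration of the circular buffer drl describes, here is a minimal sketch. The class name `RingBuffer` and its methods are invented for this example. It is also simpler than the four-pointer layout in the comment: because old data is just overwritten and nothing consumes the buffer concurrently, a single write index (the "In" pointer) plus a count of valid bytes is assumed to be enough.

```python
class RingBuffer:
    """Fixed-size byte buffer that keeps only the most recent `capacity` bytes.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.data = bytearray(capacity)
        self.pos = 0        # index where the next byte goes (the "In" pointer)
        self.filled = 0     # number of valid bytes, never more than capacity

    def write(self, chunk):
        chunk = bytes(chunk)
        if len(chunk) >= self.capacity:
            # Only the tail of an oversized chunk can survive.
            self.data[:] = chunk[-self.capacity:]
            self.pos = 0
            self.filled = self.capacity
            return
        end = self.pos + len(chunk)
        if end <= self.capacity:
            self.data[self.pos:end] = chunk
        else:
            # The chunk wraps around the end of the buffer.
            first = self.capacity - self.pos
            self.data[self.pos:] = chunk[:first]
            self.data[:end - self.capacity] = chunk[first:]
        self.pos = end % self.capacity
        self.filled = min(self.capacity, self.filled + len(chunk))

    def snapshot(self):
        """Return the buffered bytes in arrival order, oldest first."""
        if self.filled < self.capacity:
            return bytes(self.data[:self.filled])
        return bytes(self.data[self.pos:] + self.data[:self.pos])
```

A program in the role of cb would call `write()` on every block read from the download and write `snapshot()` to a file when the signal arrives.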

Boost: <theboostcpplibraries.com/boost.circularbuffer>
– drl
Oct 30 '17 at 12:11

1 Answer

Here's a simple Python script that may help. It just reads from stdin and writes to stdout, but keeps the last N bytes in memory (10,000,000 as written). If you interrupt it with Ctrl-C (SIGINT), it dumps the buffer to the file /tmp/sample001 and continues; later interrupts write /tmp/sample002, and so on.

#!/usr/bin/python3
# circular buffer in memory without recopy using bytearray
# https://unix.stackexchange.com/a/401875/119298
import sys, os

def copy():
    def dump(fd, v):
        fd.write(v)

    space = 10000000            # buffer size in bytes: the "N" to keep
    buffer = bytearray(space)   # all zero bytes
    view = memoryview(buffer)
    head = 0; wrapped = False   # next write offset; True once the buffer has filled
    # reopen stdin and stdout as unbuffered binary streams
    sys.stdin = os.fdopen(sys.stdin.fileno(), 'rb', 0)
    sys.stdout = os.fdopen(sys.stdout.fileno(), 'wb', 0)
    fileno = 1
    while True:
        try:
            # read straight into the free part of the buffer, no extra copy
            nbytes = sys.stdin.readinto(view[head:])
            if nbytes == 0:
                break  # eof
            sys.stdout.write(view[head:head+nbytes].tobytes())
            head += nbytes
            if head >= space:
                head = 0; wrapped = True
        except KeyboardInterrupt:
            filename = "/tmp/sample%03d" % fileno
            fileno += 1
            with open(filename, "wb") as fd:
                if wrapped:
                    dump(fd, view[head:])   # oldest data: from head to the end
                if head:
                    dump(fd, view[0:head])  # newest data: from the start up to head

copy()

If you don't have Python 3, the script will need a few changes to run under Python 2.7.
The dump can start at an arbitrary byte offset, so you might need to worry about preserving legal AAC framing. Try it first, though: you may find that whatever you are using manages to self-sync on data that starts at an arbitrary offset.
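
If the player does not self-sync, one possible workaround is to trim the dump to the first frame boundary. The sketch below assumes the stream is AAC in ADTS framing, which the question does not state; the function name `find_adts_frame` is made up for the example. It relies on each ADTS frame starting with a 12-bit syncword of all ones and carrying a 13-bit frame length in its header, and it accepts a candidate only if another syncword sits exactly one frame length later.

```python
def find_adts_frame(data, start=0):
    """Return the offset of the first plausible ADTS frame at or after
    `start`, or -1 if none is found.

    `data` must be bytes or bytearray. Hypothetical helper; assumes ADTS.
    """
    i = start
    while True:
        i = data.find(b"\xff", i)
        if i < 0 or i + 7 > len(data):
            return -1   # no candidate with a complete 7-byte header left
        # Syncword: 0xFF followed by a byte whose top four bits are set.
        if data[i + 1] & 0xF0 == 0xF0:
            # 13-bit frame length (header included) spread over bytes 3..5.
            length = (((data[i + 3] & 0x03) << 11)
                      | (data[i + 4] << 3)
                      | (data[i + 5] >> 5))
            nxt = i + length
            # Accept only if the following frame also starts with a syncword.
            if (length >= 7 and nxt + 1 < len(data)
                    and data[nxt] == 0xFF and data[nxt + 1] & 0xF0 == 0xF0):
                return i
        i += 1
```

With the contents of a dumped file in `raw`, `raw[find_adts_frame(raw):]` would then start on a frame boundary, provided the function did not return -1.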

answered Nov 1 '17 at 17:36

meuh

