Did any machines alternate between two video memory banks?
If a home computer or console has two banks of memory, A and B, then the following design is possible:
The video chip connects only to bank A. The CPU connects to both. During the active part of each scan line, CPU access to bank A is slow because it must share bandwidth with the video chip, but it can access bank B at full speed. This is useful because you can put code and non-video data in bank B, though the CPU will still hit some slowdown because it must write to bank A in order to prepare the next frame while the current one is showing.
The Amiga is perhaps the best-known example of this design; the two banks were called chip memory (the first 512K) and fast memory (the rest), respectively. The 48K Spectrum was also a popular example; the two banks were the first 16K and the other 32K respectively.
It seems to me you could improve performance with a variant design: as above, except the video chip and the CPU each connect to both banks.
In the first frame, the video chip displays from bank A while the CPU prepares the next frame in bank B.
In the second frame, the video chip switches to displaying from bank B while the CPU prepares the next frame in bank A.
In the third frame, it switches back to the first arrangement, etc.
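To make the alternation concrete, here is a minimal sketch in Python of the role swap I have in mind. It is a toy model, not a description of any real machine: the class name `PingPongVideo`, its methods and the 4-byte bank size are all made up for illustration.

```python
# Toy model of the proposed scheme: two banks whose roles swap every frame.
# All names here are hypothetical; no real hardware is being described.

class PingPongVideo:
    def __init__(self, bank_size):
        self.banks = [bytearray(bank_size), bytearray(bank_size)]  # bank A, bank B
        self.display_index = 0  # bank the video chip scans out this frame

    @property
    def display_bank(self):
        return self.banks[self.display_index]

    @property
    def draw_bank(self):
        # The CPU only ever touches the bank that is not being displayed.
        return self.banks[1 - self.display_index]

    def end_of_frame(self):
        # At vertical blank the two roles swap.
        self.display_index = 1 - self.display_index


video = PingPongVideo(bank_size=4)
roles = []
for frame in range(3):
    video.draw_bank[0] = frame + 1           # CPU prepares the next frame
    roles.append("AB"[video.display_index])  # which bank is on display
    video.end_of_frame()

print(roles)  # ['A', 'B', 'A']
```

The point of the sketch is only that the CPU and the video chip never address the same bank within a frame, so in principle neither has to wait for the other.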
It seems to me this fairly minor variation on the historical design would allow the CPU to always work at full speed even when writing the data for the next frame. (In this context, I'm just thinking about scenarios where the CPU does that work, not ones where the video chip itself contains acceleration hardware that writes to video memory.)
Did any historical machines work the way I suggest, with video chip and CPU switching back-and-forth between two banks each frame? If not, why not? Is there some disadvantage I'm not taking into account?
hardware video
4
Given that serial accesses are more likely than not, I always wondered why a design didn't use the low bit of addressing to select a RAM bank, and halt the CPU only if it is accessing the same one as video that cycle. But I think the answer always ends up back at fixed memory maps and caution about growing RAM sizes: it's impossible to consolidate two distinct banks of chips into a single bank if both need to be accessed simultaneously.
– Tommy
Dec 31 '18 at 22:09
4
It's an interesting question. The cost and performance should be compared with dual-ported RAM, which achieves the same result.
– traal
Dec 31 '18 at 22:12
1
@tofro Why? It's a general and accurate description – or, are you thinking of some more specialized usage of the term, that would not be applicable in this case?
– rwallace
Dec 31 '18 at 22:46
3
@tofro Erm. AFAICR he never says 'banked', only 'bank (of memory)'. These terms are not interchangeable. A bank of memory is a separate memory entity (usually separate chips). It does not denote where (and when) it's located within a certain address space. In contrast, 'banked' denotes a system where multiple banks share the same address space. So yes, the Amiga does have memory banks, and no, it is not a banked system.
– Raffzahn
Jan 1 at 0:45
1
This scheme actually requires 3 banks of memory to work: a pair of banks for frame buffers which are switched between, and a third bank for program data (and code). There are a lot of other things a program does besides accessing video memory, including the CPU reading the program instructions themselves from memory.
– Ken Gober
Jan 2 at 14:01
asked Dec 31 '18 at 21:39
rwallace
4 Answers
Did any historical machines work the way I suggest, with video chip and CPU switching back-and-forth between two banks each frame?
Not that I'm aware of. There were many designs with multiple banks, but none set up specifically in the way you describe.
If not, why not? Is there some disadvantage I'm not taking into account?
Maybe because it's a very narrow solution for a problem that didn't really occur?
First of all, the influence of video access on a CPU is negligible in most cases. That is, if there is already a separate bank where the CPU can execute program code and access data without being slowed down by video. Even in a tight loop, CPU access to video memory happens more in the region of every 10th cycle or less (*1). Thus, if the video circuit allows at least some CPU access to video memory during display, there is no significant performance impact. Since most transactions on video memory are writes, a single write buffer may already provide great relief.
Second, this solution will only speed up programs that use a rendering scheme based on two buffers. That is a rather new development, only introduced with CPU/GPU setups fast enough to render a whole scene from base data within 1/50th of a second or less. Nothing early machines could do. There it was faster to just manipulate the existing, single screen. Heck, the really early ones had to use sprites as a crutch to overcome speed issues.
Third, double buffering isn't primarily (if at all) a way to avoid the slowdown caused by video access during screen manipulation, but a way to avoid flicker. Double buffering at first increases the handling effort, as either full rendering or interleaved updates (*2) are required. Its advantage is that it decouples the timing of display and update below the level of a frame (*3). With two buffers the CPU no longer needs to synchronize the sequence and speed of screen updates with the video circuit. Even frame drops are possible - not nice but possible - without harming the presented picture.
*1 - Yes, tighter examples can be constructed on several machines, but they are the exception in everyday tasks.
*2 - This means that the buffer being updated (the one not displayed) does not hold the previous frame, but the one before that, so two sets of update-related bookkeeping need to be held and managed. This can make code rather complicated - usually resulting in abstract engines - something that is again easy to do today, but was hard to implement on the power-constrained systems back then.
*3 - With a single display buffer, updates have to either complete during retrace or 'follow the beam'. This means that during a frame only the parts that have already been displayed can be changed without generating flicker or ripple. The challenge is to use as much of the time as possible, keeping close to the line being displayed, but never overtaking the video circuit. This includes looking ahead when it comes to sprites and the like.
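The bookkeeping from footnote *2 can be sketched roughly as follows. This is a toy illustration in Python with a one-line "screen" and a one-character sprite; the class `DoubleBufferedLine` and its fields are hypothetical names, not taken from any real engine.

```python
# Toy one-line "screen" with two buffers. Each buffer needs its own record
# of what was drawn into it, because the hidden buffer always holds the
# frame from two flips ago, not the previous one.

class DoubleBufferedLine:
    def __init__(self, width):
        self.buffers = [["."] * width, ["."] * width]
        self.last_drawn = [None, None]  # sprite position stored per buffer
        self.shown = 0                  # index of the buffer on display

    def render(self, sprite_x):
        hidden = 1 - self.shown
        stale_x = self.last_drawn[hidden]
        if stale_x is not None:
            # Erase what THIS buffer contains, i.e. the position from two
            # frames ago, not the most recent sprite position.
            self.buffers[hidden][stale_x] = "."
        self.buffers[hidden][sprite_x] = "#"
        self.last_drawn[hidden] = sprite_x
        self.shown = hidden             # flip during vertical retrace
        return "".join(self.buffers[self.shown])


screen = DoubleBufferedLine(width=8)
frames = [screen.render(x) for x in range(4)]
print(frames)  # ['#.......', '.#......', '..#.....', '...#....']
```

With a single `last_drawn` value instead of one per buffer, the erase step would clear the wrong cell and leave a trail of stale sprites in alternating frames.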
The Apple IIe with 80-column text card was capable of something like this. It came with a second, 64K memory space available through bank-switching. From page 23 of The Apple IIe Technical Reference Manual:
Data for the high-resolution graphics displays are stored in either of two 8192-byte areas in memory. These areas are called High-Resolution Page 1 and Page 2; think of them as buffers where you can put data to be displayed.
There were separate latches between the display controller and each of the main and alternate memory spaces, and in double high-resolution mode, you would enable both latches, interleaving the bitmaps stored in each buffer. The two halves of video memory were split between $2000–$3FFF of the main memory space and the alternate memory space. See page 35 of the manual, among other places.
The Apple IIe did not, however, give faster access to banks of memory the current video mode was not using.
6
Though you get the same memory access speed regardless of which banks are on display.
– Tommy
Dec 31 '18 at 23:22
@Tommy Good point.
– Davislor
Jan 1 at 0:00
@Davislor: The use of two separate memory banks made it possible for the system to accommodate three memory operations per cycle using banks of memory that are only fast enough to accommodate two. I think the video fetches are done simultaneously rather than interleaved, but the concept is sound.
– supercat
Jan 1 at 18:46
@supercat The manual shows the timing and signals, so yes, that’s what the latches are for. To clarify: the design doesn’t give faster CPU access to the unused bank than to the used bank in single high resolution mode.
– Davislor
Jan 2 at 0:36
1
@Davislor: If the Apple had used one bank of memory on one frame, then one for the next, etc., clocking out data at the rate required for a maximum-resolution display, the CPU would have times when it would be able to access things at ~2MHz, but long intervals when it would get no access at all. Fetching a byte from each bank every ~microsecond minimizes the length of time between main-CPU access opportunities.
– supercat
Jan 2 at 18:05
In my opinion it's a system design question:
In the period this question concerns, RAM was expensive, so storing any significant sequence of images directly in RAM was prohibitive if you intended to build a cheap system. So let's go with 2 frames.
For a cheap system, the read/write speed would be limited by the CPU and data bus, not by the RAM (the DRAMs had access times significantly below 1 microsecond). Even a CPU doing nothing but copying data to your video banks would probably manage something in the 200 kB/s range for a Z80, and on the order of 50-100 kB/s for a 6502-family CPU such as the 6510. That means storing 2 frames wouldn't really help you a lot, since the CPU could not update the buffer any more often than before.
The only way to make sense of such a system would have been to give it multiple CPUs, each connected to one video buffer. So for a typical 8-bit system, the solution would not have helped much in any use case. I am unsure whether specialized, expensive video/graphics equipment did something like this (I could imagine that).
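To put rough numbers on that, here is a small back-of-the-envelope calculation in Python. The copy rates are the estimates from above; the 50 Hz frame rate and the 6912-byte screen (the size of the ZX Spectrum display file) are assumptions chosen only as an example, and kB is taken as 1000 bytes.

```python
import math

FRAME_RATE_HZ = 50    # assumed PAL frame rate
SCREEN_BYTES = 6912   # example screen size (ZX Spectrum display file)

def bytes_per_frame(copy_rate_bytes_per_s):
    """Bytes the CPU can copy in one frame if it does nothing else."""
    return copy_rate_bytes_per_s // FRAME_RATE_HZ

def frames_for_full_redraw(copy_rate_bytes_per_s):
    """Whole frames needed to rewrite the entire example screen."""
    return math.ceil(SCREEN_BYTES / bytes_per_frame(copy_rate_bytes_per_s))

# Rough copy rates in bytes per second, as estimated in the answer.
rates = {
    "Z80, ~200 kB/s": 200_000,
    "6502, ~100 kB/s": 100_000,
    "6502, ~50 kB/s": 50_000,
}
for name, rate in rates.items():
    print(name, bytes_per_frame(rate), frames_for_full_redraw(rate))
# Z80, ~200 kB/s 4000 2
# 6502, ~100 kB/s 2000 4
# 6502, ~50 kB/s 1000 7
```

Under these assumptions even the best case cannot fill a whole screen within one frame, so the CPU, not contention for the video bank, is the limiting factor.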
I don't get that. I was referring to the fastest copy operations which would be possible on such a CPU, given a fast bus speed. If the CPU can't write the data faster, then having different buffers for different frames does not solve a problem for most use cases.
– Sascha
Jan 1 at 18:42
Whoops. Misread your answer. It's fine! Carry on.
– wizzwizz4♦
Jan 1 at 18:42
Relatively few programs update the entire display every frame. When using a double-buffered display, anything that will be shown on two or more consecutive frames will need to be updated at least twice; it would be rare that writing something twice on inactive banks would be cheaper than writing it once on an active bank.
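As a rough way to state that trade-off, the comparison below assumes a per-byte cost for an uncontended write and a higher one for a write that competes with the video chip. The function name and the cycle counts are hypothetical and only illustrate the inequality.

```python
def double_buffer_wins(fast_cycles, contended_cycles):
    """True if writing twice to inactive banks beats one contended write.

    fast_cycles:      cost of one write to a bank the video chip is not using
    contended_cycles: cost of one write to the bank currently on display
    """
    return 2 * fast_cycles < contended_cycles

# Hypothetical figures: contention adds 50% to a 4-cycle write.
print(double_buffer_wins(4, 6))   # False
# Contention would have to more than double the cost to change the answer.
print(double_buffer_wins(4, 9))   # True
```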
A somewhat more useful variation on your proposal would be to have one memory bank for the top half of the screen and one for the bottom. Many of Eugene Jarvis' games like Defender, Stargate, Robotron, etc. would write the bottom half of the screen while the beam was on the top half, and vice versa. I don't think the hardware used separate banks with the described access timings, but doing so would have made it possible to use slower DRAMs than would otherwise have been required.
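The half-screen idea can be sketched like this, assuming a hypothetical display of 192 visible lines with each half held in its own bank. The function name and the line count are made up for illustration and do not describe the actual hardware of those games.

```python
VISIBLE_LINES = 192  # hypothetical display height

def safe_half(beam_line):
    """Return which half of the screen the CPU can update right now.

    While the beam is in the top half, the bank holding the bottom half
    is free, and vice versa. Outside the visible area (vertical blank)
    both halves are free.
    """
    if not 0 <= beam_line < VISIBLE_LINES:
        return "both"
    return "bottom" if beam_line < VISIBLE_LINES // 2 else "top"

print(safe_half(10))    # bottom
print(safe_half(150))   # top
print(safe_half(200))   # both
```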
add a comment |
Your Answer
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "648"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fretrocomputing.stackexchange.com%2fquestions%2f8624%2fdid-any-machines-alternate-between-two-video-memory-banks%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
4 Answers
4
active
oldest
votes
4 Answers
4
active
oldest
votes
active
oldest
votes
active
oldest
votes
Did any historical machines work the way I suggest, with video chip and CPU switching back-and-forth between two banks each frame?
Not that I'm aware of. There were many designs with multiple banks, but never set up especially in the way that you ask for.
If not, why not? Is there some disadvantage I'm not taking into account?
Maybe because it's a very narrow solution for a problem that didn't really occur?
First of all, the influence of video access on a CPU is for most cases negligible. That is if there is already a separate bank, where the CPU can execute program/data without being slowed down by video. Even in a close loop CPU access to video memory is more in the region of every 10th cycle or less (*1), thus, if a video circuit allows at least some video access during display, no significant performance impact happens. Since most transactions on video memory are writes, a single write buffer may already provide a great relief.
Second, this solution will only speed up programs that use a two buffer based rendering scheme. They are a rather new development, only introduced with CPU/GPU setups fast enough to render a whole scene within 1/50th of a second or less from base data. Nothing early machines could do. Here it was faster to just manipulate the existing, single screen. Heck, real early ones had to use sprites as a crutch to overcome speed issues.
Third, use of double buffering isn't primarily (if at all) a solution to avoid performance slow down by video access during screen manipulation, but to avoid flicker. Double buffer handling does at first increase handling effort, as either full rendering or interleaved update (*2) is required. Its advantage is a decoupling in timing (below the level of a frame) between display and update (*3). With two buffers the CPU no longer needs to synchronize sequence and speed of screen update with video. Even frame drops are possible - not nice but possible - without harming the presented picture.
*1 - Yes, tighter examples can be constructed on several machines, but they are the exception in everyday tasks.
*2 - This means, when updating a buffer (the one not displayed), it does not represent the previous frame, but the one before that, so two counts of update related bookkeeping need to be held and managed. This can make code rather complicated - usually resulting in abstract engines - something again easy to do today, but hard to implement on power constrained systems back then.
*3 - With a single display buffer updates have to happen complete during retrace, or 'following the beam'. This means that during a frame only the parts that have already be displayed can be changed without generating flicker or ripple. The hurdle is to use as much time as possible, so keeping close to the line in display, but never overtake the video circuit. This includes looking ahead when it's about sprites and alike.
add a comment |
Did any historical machines work the way I suggest, with video chip and CPU switching back-and-forth between two banks each frame?
Not that I'm aware of. There were many designs with multiple banks, but never set up especially in the way that you ask for.
If not, why not? Is there some disadvantage I'm not taking into account?
Maybe because it's a very narrow solution for a problem that didn't really occur?
First of all, the influence of video access on a CPU is for most cases negligible. That is if there is already a separate bank, where the CPU can execute program/data without being slowed down by video. Even in a close loop CPU access to video memory is more in the region of every 10th cycle or less (*1), thus, if a video circuit allows at least some video access during display, no significant performance impact happens. Since most transactions on video memory are writes, a single write buffer may already provide a great relief.
Second, this solution will only speed up programs that use a two buffer based rendering scheme. They are a rather new development, only introduced with CPU/GPU setups fast enough to render a whole scene within 1/50th of a second or less from base data. Nothing early machines could do. Here it was faster to just manipulate the existing, single screen. Heck, real early ones had to use sprites as a crutch to overcome speed issues.
Third, use of double buffering isn't primarily (if at all) a solution to avoid performance slow down by video access during screen manipulation, but to avoid flicker. Double buffer handling does at first increase handling effort, as either full rendering or interleaved update (*2) is required. Its advantage is a decoupling in timing (below the level of a frame) between display and update (*3). With two buffers the CPU no longer needs to synchronize sequence and speed of screen update with video. Even frame drops are possible - not nice but possible - without harming the presented picture.
*1 - Yes, tighter examples can be constructed on several machines, but they are the exception in everyday tasks.
*2 - This means, when updating a buffer (the one not displayed), it does not represent the previous frame, but the one before that, so two counts of update related bookkeeping need to be held and managed. This can make code rather complicated - usually resulting in abstract engines - something again easy to do today, but hard to implement on power constrained systems back then.
*3 - With a single display buffer updates have to happen complete during retrace, or 'following the beam'. This means that during a frame only the parts that have already be displayed can be changed without generating flicker or ripple. The hurdle is to use as much time as possible, so keeping close to the line in display, but never overtake the video circuit. This includes looking ahead when it's about sprites and alike.
add a comment |
Did any historical machines work the way I suggest, with video chip and CPU switching back-and-forth between two banks each frame?
Not that I'm aware of. There were many designs with multiple banks, but never set up especially in the way that you ask for.
If not, why not? Is there some disadvantage I'm not taking into account?
Maybe because it's a very narrow solution for a problem that didn't really occur?
First of all, the influence of video access on a CPU is for most cases negligible. That is if there is already a separate bank, where the CPU can execute program/data without being slowed down by video. Even in a close loop CPU access to video memory is more in the region of every 10th cycle or less (*1), thus, if a video circuit allows at least some video access during display, no significant performance impact happens. Since most transactions on video memory are writes, a single write buffer may already provide a great relief.
Second, this solution will only speed up programs that use a two buffer based rendering scheme. They are a rather new development, only introduced with CPU/GPU setups fast enough to render a whole scene within 1/50th of a second or less from base data. Nothing early machines could do. Here it was faster to just manipulate the existing, single screen. Heck, real early ones had to use sprites as a crutch to overcome speed issues.
Third, use of double buffering isn't primarily (if at all) a solution to avoid performance slow down by video access during screen manipulation, but to avoid flicker. Double buffer handling does at first increase handling effort, as either full rendering or interleaved update (*2) is required. Its advantage is a decoupling in timing (below the level of a frame) between display and update (*3). With two buffers the CPU no longer needs to synchronize sequence and speed of screen update with video. Even frame drops are possible - not nice but possible - without harming the presented picture.
*1 - Yes, tighter examples can be constructed on several machines, but they are the exception in everyday tasks.
*2 - This means that the buffer being updated (the one not displayed) does not represent the previous frame but the one before that, so two sets of update-related bookkeeping need to be held and managed. This can make code rather complicated, usually resulting in abstract engines - again something easy to do today, but hard to implement on the power-constrained systems of the time.
*3 - With a single display buffer, updates have to happen either completely during retrace or by 'following the beam'. This means that during a frame, only the parts that have already been displayed can be changed without generating flicker or ripple. The challenge is to use as much time as possible by staying close to the line being displayed without ever overtaking the video circuit. This includes looking ahead when sprites and the like are involved.
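To make the bookkeeping from footnote *2 concrete, here is a minimal Python sketch of an interleaved update with two buffers. It is an illustration only: the class and method names are hypothetical, and a real machine would do the flip by switching a video base address during vertical retrace rather than by changing an index.

```python
class DoubleBuffer:
    """Two frame buffers: one is displayed while the other is drawn into."""

    def __init__(self, size: int) -> None:
        self.buffers = [bytearray(size), bytearray(size)]
        self.front = 0  # index of the buffer currently being displayed
        # Per buffer: changes it has not received yet (offset -> value).
        self.pending = [{}, {}]

    @property
    def back(self) -> int:
        """Index of the hidden buffer that the CPU may draw into."""
        return 1 - self.front

    def write(self, offset: int, value: int) -> None:
        """Draw into the hidden buffer and remember the change for the shown one."""
        self.buffers[self.back][offset] = value
        self.pending[self.front][offset] = value

    def flip(self) -> None:
        """Swap roles, then bring the newly hidden buffer up to date.

        The newly hidden buffer is one frame behind, so every change is
        written a second time here. This is the extra handling effort.
        """
        self.front = self.back
        hidden = self.back
        for offset, value in self.pending[hidden].items():
            self.buffers[hidden][offset] = value
        self.pending[hidden].clear()


if __name__ == "__main__":
    screen = DoubleBuffer(4)
    screen.write(0, 7)   # goes into the hidden buffer only
    screen.flip()        # hidden buffer becomes visible, the other one catches up
    print(screen.buffers)
```

Note that every change ends up being written twice, once per buffer, which is why double buffering costs extra work on a slow CPU even when the writes themselves are uncontended.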
edited Jan 2 at 11:39
Community♦
answered Jan 1 at 0:40
Raffzahn
The Apple IIe with 80-column text card was capable of something like this. It came with a second, 64K memory space available through bank-switching. From page 23 of The Apple IIe Technical Reference Manual:
Data for the high-resolution graphics displays are stored in either of two 8192-byte areas in memory. These areas are called High-Resolution Page 1 and Page 2; think of them as buffers where you can put data to be displayed.
Separate latches sat between the display controller and each of the main and alternate memory spaces, and in double high-resolution mode you would enable both latches, interleaving the bitmaps stored in each buffer. The two halves of video memory occupied $2000–$3FFF in the main memory space and in the alternate memory space, respectively. See page 35 of the manual, among other places.
The Apple IIe did not, however, give faster access to banks of memory the current video mode was not using.
6
Though you get the same memory access speed regardless of which banks are on display.
– Tommy
Dec 31 '18 at 23:22
@Tommy Good point.
– Davislor
Jan 1 at 0:00
@Davislor: The use of two separate memory banks made it possible for the system to accommodate three memory operations per cycle using banks of memory that are only fast enough to accommodate two. I think the video fetches are done simultaneously rather than interleaved, but the concept is sound.
– supercat
Jan 1 at 18:46
@supercat The manual shows the timing and signals, so yes, that’s what the latches are for. To clarify: the design doesn’t give faster CPU access to the unused bank than to the used bank in single high resolution mode.
– Davislor
Jan 2 at 0:36
1
@Davislor: If the Apple had used one bank of memory on one frame, then one for the next, etc., clocking out data at the rate required for a maximum-resolution display, the CPU would have times when it would be able to access things at ~2MHz, but long intervals when it would get no access at all. Fetching a byte from each bank every ~microsecond minimizes the length of time between main-CPU access opportunities.
– supercat
Jan 2 at 18:05
edited Jan 1 at 16:36
answered Dec 31 '18 at 22:26
Davislor
In my opinion it's a system design question:
At the time relevant to this question, RAM was expensive, so storing a sequence of images of any significant length directly in RAM was prohibitive if you intended to build a cheap system. So let's go with 2 frames.
For a cheap system, the read/write speed would be limited by the CPU and data bus rather than by the memory: a Z80 or 6510 doing nothing but copying data to your video banks would probably manage in the range of 200 kB/s for a Z80, and on the order of 50-100 kB/s for a 6502, while the DRAMs had access times significantly below 1 microsecond. Storing 2 frames therefore wouldn't really help you a lot, since the CPU could not update the buffer any faster.
The only way to make sense of such a system would have been to give it multiple CPUs, each connected to one video buffer. So for a typical 8-bit system, the solution would not have helped much in any use case. I am unsure whether specialized, expensive video/graphics equipment did something like this (I could imagine that).
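The figures above can be turned into a quick back-of-the-envelope check. This is only a sketch: the function names are hypothetical, kB is taken as 1000 bytes, the 50 Hz frame rate is an assumption, and the 8192-byte screen size is borrowed from the Apple II high-resolution page size quoted elsewhere on this page.

```python
def bytes_per_frame(copy_rate_bytes_per_s: float, frames_per_s: float) -> float:
    """How many bytes the CPU can copy during one displayed frame."""
    return copy_rate_bytes_per_s / frames_per_s


def frames_to_fill(screen_bytes: int, copy_rate_bytes_per_s: float,
                   frames_per_s: float) -> float:
    """How many frame times a full-screen rewrite takes at that copy rate."""
    return screen_bytes / bytes_per_frame(copy_rate_bytes_per_s, frames_per_s)


if __name__ == "__main__":
    FPS = 50              # assumed frame rate
    SCREEN_BYTES = 8192   # assumed bitmap size

    for cpu, rate in (("Z80", 200_000), ("6502", 100_000)):
        per_frame = bytes_per_frame(rate, FPS)
        frames = frames_to_fill(SCREEN_BYTES, rate, FPS)
        print(f"{cpu}: {per_frame:.0f} bytes per frame, "
              f"{frames:.1f} frames to rewrite the screen")
```

Under these assumptions neither CPU can rewrite a whole screen within one frame time even with an uncontended bus, so the limit is the CPU's copy rate rather than memory contention.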
I don't get that. I was referring to the fastest copy operations that would be possible on such a CPU, given a fast bus speed. If the CPU can't write the data faster, then having different buffers for different frames does not solve a problem for most use cases.
– Sascha
Jan 1 at 18:42
Whoops. Misread your answer. It's fine! Carry on.
– wizzwizz4♦
Jan 1 at 18:42
answered Jan 1 at 16:34
Sascha
Relatively few programs update the entire display every frame. When using a double-buffered display, anything that will be shown on two or more consecutive frames will need to be updated at least twice; it would be rare that writing something twice on inactive banks would be cheaper than writing it once on an active bank.
A somewhat more useful variation on your proposal would be to have one memory bank for the top half of the screen and one for the bottom. Many of Eugene Jarvis' games like Defender, Stargate, Robotron, etc. would write the bottom half of the screen while the beam was on the top half, and vice versa. I don't think the hardware used separate banks with the described access timings, but doing so would have made it possible to use slower DRAMs than would otherwise have been required.
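The trade-off in the first paragraph can be written down as a small cost comparison. This is a sketch with hypothetical names and made-up cycle counts; the only thing taken from the text is that a double-buffered update must be written twice, while a single-buffered update is written once at the slower, contended speed.

```python
def single_buffer_cost(n_bytes: int, contended_cycles_per_byte: float) -> float:
    """Cycles to write an update once into the bank the video chip is reading."""
    return n_bytes * contended_cycles_per_byte


def double_buffer_cost(n_bytes: int, fast_cycles_per_byte: float) -> float:
    """Cycles to write the same update into both buffers at full speed."""
    return 2 * n_bytes * fast_cycles_per_byte


def double_buffering_is_cheaper(contended_cycles_per_byte: float,
                                fast_cycles_per_byte: float) -> bool:
    """True only if contention more than doubles the cost of a write."""
    return contended_cycles_per_byte > 2 * fast_cycles_per_byte


if __name__ == "__main__":
    # Assumed figures: 4 cycles per byte uncontended, 5 cycles when contended.
    print(single_buffer_cost(100, 5), double_buffer_cost(100, 4))
    print(double_buffering_is_cheaper(5, 4))
```

With these assumed figures, writing once into the active bank is cheaper. Alternating banks would only pay off if contention made each write more than twice as slow.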
answered Jan 1 at 18:55
supercat
4
Given that serial accesses are more likely than not, I always wondered why a design didn't use the low bit of addressing to select a RAM bank, and halt the CPU only if it is accessing the same one as video that cycle. But I think the answer always ends up back at fixed memory maps and caution about growing RAM sizes: it's impossible to consolidate two distinct banks of chips into a single bank if both need to be accessed simultaneously.
– Tommy
Dec 31 '18 at 22:09
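The idea in this comment can be sketched as a toy simulation. It is not a model of any real machine, and all names are hypothetical: the low address bit selects one of two banks, the video circuit is assumed to fetch one byte per cycle from consecutive addresses, and the CPU is assumed to attempt one access per cycle and to wait a cycle whenever it wants the bank video is using.

```python
def bank_of(address: int) -> int:
    """The low address bit selects one of two RAM banks."""
    return address & 1


def count_stalls(cpu_addresses, video_start: int) -> int:
    """Count the cycles the CPU loses to bank conflicts with video.

    Video reads consecutive addresses, so its bank alternates every cycle.
    A stalled CPU access is retried on the following cycle.
    """
    stalls = 0
    video_address = video_start
    for cpu_address in cpu_addresses:
        while bank_of(cpu_address) == bank_of(video_address):
            stalls += 1          # CPU halted for this cycle
            video_address += 1   # video moves on to the other bank
        video_address += 1       # cycle in which the CPU access succeeds
    return stalls


if __name__ == "__main__":
    sequential = [0, 1, 2, 3, 4, 5, 6, 7]
    same_bank = [0, 2, 4, 6]
    print(count_stalls(sequential, video_start=0))
    print(count_stalls(same_bank, video_start=0))
```

In this toy model a run of sequential CPU accesses falls into step with the video fetches after at most one stall, while accesses that stay in one bank stall every time.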
4
It's an interesting question. The cost and performance should be compared with dual ported RAM which achieves the same result.
– traal
Dec 31 '18 at 22:12
1
@tofro Why? It's a general and accurate description – or, are you thinking of some more specialized usage of the term, that would not be applicable in this case?
– rwallace
Dec 31 '18 at 22:46
3
@tofro Erm. AFAICR he never says 'banked', only 'bank (of memory)'. These terms are not interchangeable. A bank of memory is a separate memory entity (usually separate chips). It does not denote where (and when) it's located within a certain address space. In contrast, 'banked' denotes a system where multiple banks share the same address space. So yes, the Amiga does have memory banks, and no, it is not a banked system.
– Raffzahn
Jan 1 at 0:45
1
This scheme actually requires 3 banks of memory to work: a pair of banks for frame buffers which are switched between, and a third bank for program data (and code). There are a lot of other things a program does besides accessing video memory, including the CPU reading the program instructions themselves from memory.
– Ken Gober
Jan 2 at 14:01