Dear Forum readers
I've been scratching my head over this for some time. I have a custom board that gives me a periodic interrupt. I've written an ISR/DSR handler pair and they work, sometimes. I've stripped the code down, and the two examples here are the essence of my problem. The first one works as expected; in the second the DSR is never called, but apart from that the rest of the system performs as it should. As you can see, the ONLY change is that abe++ has been replaced with three nops, just to make it easier to compare objdumps. The objdumps of the complete program are identical apart from the sections shown, so all variables, functions etc. reside in identical places. The compiler has chosen different registers in the two examples, but I really cannot see anything illegal in this assembler code. Please have a look, this is beyond me.
-- working --
cyg_uint32 avr_isr_function(cyg_vector_t vector, cyg_addrword_t data)
{
cyg_interrupt_mask( vector );
cyg_interrupt_acknowledge( vector );
// Acknowledge
Data = IORD_32DIRECT(AVRIF_BASE,0);
for (i=0;i<8;i++) {
Data = IORD_32DIRECT(AVRIF_BASE,i*4+4);
IOWR_32DIRECT(AVRIF_BASE,i*4+4,buf1);
asm("nop;nop;nop");
} // endfor
abe += 8;
return CYG_ISR_CALL_DSR;
}
void avr_dsr_function(cyg_vector_t vector, cyg_ucount32 count, cyg_addrword_t data)
{
// Unmask it
abe++;
cyg_interrupt_unmask( vector );
}
objdump:
Data = IORD_32DIRECT(AVRIF_BASE,i*4+4);
8004b0: d3601917 ldw r13,-32668(gp)
8004b4: 681890ba slli r12,r13,2
8004b8: 62066104 addi r8,r12,6532
8004bc: 42800037 ldwio r10,0(r8)
IOWR_32DIRECT(AVRIF_BASE,i*4+4,buf1);
8004c0: 6197883a add r11,r12,r6
8004c4: 5a400017 ldw r9,0(r11)
8004c8: d2a01a15 stw r10,-32664(gp)
8004cc: 42400035 stwio r9,0(r8)
asm("nop;nop;nop");
8004d0: 0001883a nop
8004d4: 0001883a nop
8004d8: 0001883a nop
8004dc: d1601917 ldw r5,-32668(gp)
8004e0: 29000044 addi r4,r5,1
8004e4: d1201915 stw r4,-32668(gp)
8004e8: 393ff10e bge r7,r4,8004b0 <_Z16avr_isr_functionjj+0x38>
-- dysfunctional --
cyg_uint32 avr_isr_function(cyg_vector_t vector, cyg_addrword_t data)
{
cyg_interrupt_mask( vector );
cyg_interrupt_acknowledge( vector );
// Acknowledge
Data = IORD_32DIRECT(AVRIF_BASE,0);
for (i=0;i<8;i++) {
Data = IORD_32DIRECT(AVRIF_BASE,i*4+4);
IOWR_32DIRECT(AVRIF_BASE,i*4+4,buf1);
abe++;
} // endfor
abe += 8;
return CYG_ISR_CALL_DSR;
}
void avr_dsr_function(cyg_vector_t vector, cyg_ucount32 count, cyg_addrword_t data)
{
// Unmask it
abe++;
cyg_interrupt_unmask( vector );
}
objdump:
Data = IORD_32DIRECT(AVRIF_BASE,i*4+4);
8004b0: d3e01917 ldw r15,-32668(gp)
8004b4: 781c90ba slli r14,r15,2
8004b8: 72866104 addi r10,r14,6532
8004bc: 53000037 ldwio r12,0(r10)
IOWR_32DIRECT(AVRIF_BASE,i*4+4,buf1);
8004c0: 719b883a add r13,r14,r6
8004c4: 6ac00017 ldw r11,0(r13)
8004c8: d3201a15 stw r12,-32664(gp)
8004cc: 52c00035 stwio r11,0(r10)
abe++;
8004d0: d2600417 ldw r9,-32752(gp)
8004d4: d2201917 ldw r8,-32668(gp)
8004d8: 49400044 addi r5,r9,1
8004dc: 41000044 addi r4,r8,1
8004e0: d1600415 stw r5,-32752(gp)
8004e4: d1201915 stw r4,-32668(gp)
8004e8: 393ff10e bge r7,r4,8004b0 <_Z16avr_isr_functionjj+0x38>
10 Replies
I am confused as to the purpose of 'abe'. But one thing that bothers me is that it is being messed with in two locations. Hopefully, that isn't a problem since the IRQ in question is masked.
Try returning (CYG_ISR_HANDLED | CYG_ISR_CALL_DSR) from your ISR.
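A minimal sketch of that suggestion, not the poster's actual code: the ISR below ORs both result flags together. The typedefs, the numeric flag values and the two stub functions are stand-ins assumed here so that the fragment compiles off-target; on a real eCos build they would come from the kernel API header instead.

```c
#include <stdint.h>

/* Stand-ins so the sketch compiles on a host machine. On an eCos target
   these names come from the kernel API header; the values are assumed. */
typedef uint32_t  cyg_uint32;
typedef uint32_t  cyg_vector_t;
typedef uintptr_t cyg_addrword_t;
enum { CYG_ISR_HANDLED = 1, CYG_ISR_CALL_DSR = 2 };

/* Stub state: bit n set means vector n is masked. */
static cyg_uint32 masked_vectors;

void cyg_interrupt_mask(cyg_vector_t vector)        { masked_vectors |= (1u << vector); }
void cyg_interrupt_acknowledge(cyg_vector_t vector) { (void)vector; }

cyg_uint32 avr_isr_function(cyg_vector_t vector, cyg_addrword_t data)
{
    (void)data;
    cyg_interrupt_mask(vector);         /* keep this source quiet until the DSR runs */
    cyg_interrupt_acknowledge(vector);
    /* Report that the interrupt was handled AND ask for the DSR to be posted. */
    return CYG_ISR_HANDLED | CYG_ISR_CALL_DSR;
}
```

The only difference from the ISR in the first post is the return expression; the device reads and writes are left out to keep the sketch short.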
--- Quote Start --- Originally posted by mike desimone@Nov 8 2005, 06:32 PM
I am confused as to the purpose of 'abe'. But one thing that bothers me is that it is being messed with in two locations. Hopefully, that isn't a problem since the IRQ in question is masked.
Try returning (CYG_ISR_HANDLED | CYG_ISR_CALL_DSR) from your ISR.
--- Quote End ---
Dear Mike
The process of isolating the problem has stripped the code of all meaningful purpose. As it stands, it just masks the interrupt, which is then unmasked in the DSR. The abe variable is declared volatile and some other process does a printf of it.
I shifted from the Nios 1 because I saw some very rare but unexplainable behaviour where the compiler just generated nonsense code. When I inserted debug code the problem vanished. As I have this code in production and want to be able to make minor changes without worrying about the whole thing breaking in some unrelated place, and because of the obvious benefits of Nios II, I decided to try an upgrade. When I see stuff like this and cannot explain it, I fear the same thing happening again. As I have a huge code base, and porting it from Nios 1 to Nios II and eCos is quite an investment, I'd like it to solve my problems.
In the examples the return codes are identical, but one works and the other doesn't. The differences are in the ISR code alone, which runs with interrupts disabled, so I have a hard time seeing how a context switch or the like could mess something up. Also, the timing must be close to identical in the two code snippets. The compiler chooses to use some other registers, but as far as I can see in the objdumps all these registers should be stored/restored during an ISR.
BTW I've tried returning both values but it makes no difference. If the DSR mechanism was just plain broken I could manage. I hope that some gifted person can correct whatever I'm doing wrong.
If the code looks OK, what about the state of the registers?
To learn more, try explicitly disabling interrupts around the DSR "abe++".
--- Quote Start --- Originally posted by tns1@Nov 9 2005, 10:42 AM
If the code looks OK, what about the state of the registers?
To learn more, try explicitly disabling interrupts around the DSR "abe++".
--- Quote End ---
The problem has been located.
Stepping through the assembler code I noticed that the index into the hal_interrupt_data is calculated and stored in r15. Then a call to the ISR takes place, and upon return r15 is used again for finding the DSR and calling the interrupt_end function. If the ISR uses r15 without restoring it, this scenario will obviously crash in many flavours.
In the broken code r15 is used by the compiler and not restored, which is perfectly legal according to the register usage table 7-2 in the Nios II Processor Reference Handbook. Depending on the kind of C code you write in the ISR, r15 may or may not be corrupted.
I've just modified the vector.s code to use r16 instead of r15, pushing it onto the stack etc., and that did the trick.
In my humble opinion this seems to be a major bug in the Nios port of eCos. Looking forward to having your opinions.
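The failure described here can be modelled on a host machine. The sketch below is only an illustration under stated assumptions: the register file is a plain array, and model_isr and dispatch_with_index_in are hypothetical names, not code from the eCos port. It relies on one fact from the Nios II ABI: r8 to r15 are caller-saved, so a compiled function may overwrite them freely, while r16 to r23 are callee-saved and must hold the same value on return.

```c
#include <stdint.h>

/* Host-side model only: a fake register file standing in for r0..r31. */
enum { R15 = 15, R16 = 16, NUM_REGS = 32 };
static uint32_t regs[NUM_REGS];

/* Hypothetical ISR body as a compiler might emit it. */
void model_isr(void)
{
    uint32_t saved_r16 = regs[R16]; /* callee-saved: the prologue saves it     */
    regs[R15] = 0xDEADBEEFu;        /* caller-saved: used as scratch, not restored */
    regs[R16] = 0x12345678u;        /* also used as scratch inside the ISR     */
    regs[R16] = saved_r16;          /* the epilogue restores it                */
}

/* Models the interrupt entry code: park the table index in `index_reg`,
   call the ISR, then read the register again to locate the DSR.
   Returns the index the entry code sees after the ISR has returned. */
uint32_t dispatch_with_index_in(int index_reg, uint32_t vector_index)
{
    regs[index_reg] = vector_index;
    model_isr();
    return regs[index_reg];
}
```

In this model, dispatch_with_index_in(R15, 5) comes back with the scratch value instead of 5, while dispatch_with_index_in(R16, 5) comes back with 5. Whether a given ISR actually touches r15 depends on the compiler's register allocation, which matches the "works sometimes" behaviour reported in the thread.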
--- Quote Start --- Originally posted by jskjoet@Nov 9 2005, 11:56 AM
The problem has been located.
Stepping through the assembler code I noticed that the index into the hal_interrupt_data is calculated and stored in r15.
Then a call to the ISR takes place, and upon return r15 is used again for finding the DSR and calling the interrupt_end function.
If the ISR uses r15 without restoring it, this scenario will obviously crash in many flavours.
In the broken code r15 is used by the compiler and not restored, which is perfectly legal according to the register usage table 7-2 in the Nios II Processor Reference Handbook.
Depending on the kind of C code you write in the ISR, r15 may or may not be corrupted.
I've just modified the vector.s code to use r16 instead of r15, pushing it onto the stack etc., and that did the trick.
In my humble opinion this seems to be a major bug in the Nios port of eCos. Looking forward to having your opinions.
--- Quote End ---
jskjoet,
Thank you for isolating this. We are getting the eCos package for Nios II 5.1 together now and this is one item being looked at.
Hi Jesse
Pleased to be able to help. Anyway, I think I might have found another, although minor, thing you might want to consider in a new release, unless I've completely misunderstood something.
_interrupt_handler enables interrupts after the ISR has been called but before posting the DSR (the comment wrongly states that the DSR is called, but it is just added to the list, right?). If we have an interrupt storm, this means that each interrupt takes up approximately 76 bytes on the stack of the thread being interrupted, which might pose a problem.
Why not wait and let the eret instruction re-enable interrupts, to prevent eating stack in case of an interrupt storm? The penalty would be a small latency increase, but it would allow you to decide on a stack size based only on the thread's variables and the functions called from the thread. As implemented, you have to consider whether interrupts could arrive back to back, and how many times, each eating approximately 76 bytes.
--- Quote Start --- Originally posted by jskjoet@Dec 3 2005, 09:42 PM
_interrupt_handler enables interrupts after the ISR has been called but before posting the DSR (the comment wrongly states that the DSR is called, but it is just added to the list, right?).
If we have an interrupt storm, this means that each interrupt takes up approximately 76 bytes on the stack of the thread being interrupted, which might pose a problem.
Why not wait and let the eret instruction re-enable interrupts, to prevent eating stack in case of an interrupt storm?
--- Quote End ---
Correct me if I'm wrong, but isn't this why the ISR masks its interrupt and the DSR unmasks it? If that same interrupt comes in between ISR and DSR, it has to wait for unmasking, at which point the DSR should be mostly done.
Hi Mike
If you use the ISR/DSR setup you can do it that way, and there's no problem. But if you have a fast low-level interrupt you cannot afford to wait for the DSR to re-enable the IRQ. I do not claim that the approach is an error, just that each IRQ eats 80 bytes of stack and that you'll have to allow for this memory usage in the stack of every thread. If you have interrupts coming back to back, which is quite normal, the stack pointer will not be restored between them and eventually you'll crash.
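To put a number on the stack argument above, here is a small sketch. It assumes the reading given in this post is right, namely that every interrupt arriving before the previous frame is popped adds one more frame to the interrupted thread's stack. The function name and its parameters are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: worst-case stack for one thread when up to
   `max_pending` interrupts can each push a frame of `frame_bytes`
   onto that thread's stack before any of the frames is popped. */
size_t thread_stack_budget(size_t thread_bytes, size_t frame_bytes, size_t max_pending)
{
    return thread_bytes + frame_bytes * max_pending;
}
```

With the 80-byte figure quoted here, a thread that needs 1024 bytes for itself and may see 4 back-to-back interrupts would need thread_stack_budget(1024, 80, 4), which is 1344 bytes. The same allowance has to be added to every thread, since any of them may be the one that is interrupted.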
--- Quote Start --- Originally posted by jskjoet@Dec 6 2005, 03:33 PM
Hi Mike
If you use the ISR/DSR setup you can do it that way, and there's no problem. But if you have a fast low-level interrupt you cannot afford to wait for the DSR to re-enable the IRQ.
I do not claim that the approach is an error, just that each IRQ eats 80 bytes of stack and that you'll have to allow for this memory usage in the stack of every thread.
If you have interrupts coming back to back, which is quite normal, the stack pointer will not be restored between them and eventually you'll crash.
--- Quote End ---
jskjoet,
I see your point, but it seems that eCos has this worked out by means of the (configurable) separate interrupt stack. I found this a good read, in case you haven't already seen it: http://ecos.sourceware.org/docs-1.3.1/ref/ecos-ref.c.html
It also seems to me that if this is a genuine problem it may be time to take another look at the system: Is the ISR (not DSR) code taking too long to execute? Can anything be done to slow the rate of interrupts? At some point the processor will be choked no matter what; I think that eCos is trying to be clever here with the separate ISR/DSR architecture, specifically to allow you to make your ISR code as brief as possible. That's just my interpretation, though.
One big caveat: I'm just getting into eCos now (here at Altera) and haven't yet done what I'd consider a complex design with it.
Hi Jesse
Thank you for the reply. This is not a real problem in my PDA app, just something I noticed reading the code while finding out why my DSR routines did not work. I thought it might be of interest to the eCos porting team.
As I read the code, and please correct me if I'm wrong, the stack is eaten from whatever thread was active, NOT from the dedicated interrupt stack. In fact, I have a hard time figuring out why the dedicated interrupt stack is worthwhile, as the ISR is called with interrupts disabled, so unless you specifically enable interrupts again in your ISR, maximum ISR stack usage is quite predictable.
My point is that it would make more sense to wait until the stack has been reclaimed before enabling IRQs.