
Widget state pipelining


  • Widget state pipelining

    I am using a gen4-uLCD-70D-SB with a Particle Photon and the library at https://github.com/4dsystems/ViSi-Ge...rticle-Library .
    I have made 4 ViSi-Genie widgets which I update via:

    Code:
    genie.WriteObject(GENIE_OBJ_ISMARTGAUGE, object_index, new_state);
    I expected that I could send updates to the smart display roughly as quickly as the serial bus could carry the messages. Instead, I observe 8,800 - 9,500 calls to
    Code:
    DoEvents(false)
    during each call to
    Code:
    Genie::WaitForIdle
    which occurs once for every
    Code:
    WriteObject
    call.

    Infrequently sending updated state works fine, and 1 widget works fine, but I need a high refresh rate on 6 of my visualizations. I've instrumented the Genie library to find the point of contention; can you help me break this 30-millisecond hard cap per update?

    Below is a small excerpt from my logs for the 4 smart widgets. One by one they block for 28-29 milliseconds waiting to reach idle, and then the loop warning fires, indicating that I cannot keep up with the rate of incoming events because of this busy-wait for the idle state. Separate, more verbose logging confirms that the message itself, once the library allows it to be sent, takes about 1 millisecond to transit the wire, including the checksumming.

    Code:
    0000041384 [app.genie] INFO: Link idle achieved. State iterations: 8875, avg latency: 0, max latency: 1, operation latency: 28
    0000041384 [app.widgets.numeric] INFO: State update.  Old: 339, New: 339, Latency: 28ms
    0000041414 [app.genie] INFO: Link idle achieved.  State iterations: 9549, avg latency: 0, max latency: 1, operation latency: 29
    0000041415 [app.widgets.numeric] INFO: State update.  Old: 339, New: 339, Latency: 30ms
    0000041444 [app.genie] INFO: Link idle achieved.  State iterations: 9146, avg latency: 0, max latency: 1, operation latency: 28
    0000041445 [app.widgets.numeric] INFO: State update.  Old: 339, New: 339, Latency: 29ms
    0000041474 [app.genie] INFO: Link idle achieved.  State iterations: 9087, avg latency: 0, max latency: 1, operation latency: 28
    0000041474 [app.widgets.numeric] INFO: State update.  Old: 339, New: 339, Latency: 28ms
    WARN: Loop falling behind.  Target: 50, Actual: 119
    0000041503 [app.genie] INFO: Link idle achieved.  State iterations: 8784, avg latency: 0, max latency: 1, operation latency: 28
    0000041503 [app.widgets.numeric] INFO: State update.  Old: 368, New: 368, Latency: 28ms
    0000041533 [app.genie] INFO: Link idle achieved.  State iterations: 9546, avg latency: 0, max latency: 1, operation latency: 29
    0000041534 [app.widgets.numeric] INFO: State update.  Old: 368, New: 368, Latency: 30ms
    0000041563 [app.genie] INFO: Link idle achieved.  State iterations: 9170, avg latency: 0, max latency: 1, operation latency: 28
    0000041564 [app.widgets.numeric] INFO: State update.  Old: 368, New: 368, Latency: 29ms
    0000041593 [app.genie] INFO: Link idle achieved.  State iterations: 9132, avg latency: 0, max latency: 1, operation latency: 28
    0000041594 [app.widgets.numeric] INFO: State update.  Old: 368, New: 368, Latency: 29ms
    WARN: Loop falling behind.  Target: 50, Actual: 119
    0000041623 [app.genie] INFO: Link idle achieved.  State iterations: 9159, avg latency: 0, max latency: 1, operation latency: 29
    0000041623 [app.widgets.numeric] INFO: State update.  Old: 398, New: 398, Latency: 29ms
    0000041653 [app.genie] INFO: Link idle achieved.  State iterations: 9578, avg latency: 0, max latency: 1, operation latency: 29
    0000041654 [app.widgets.numeric] INFO: State update.  Old: 398, New: 398, Latency: 30ms
    0000041683 [app.genie] INFO: Link idle achieved.  State iterations: 9229, avg latency: 0, max latency: 1, operation latency: 28
    0000041684 [app.widgets.numeric] INFO: State update.  Old: 398, New: 398, Latency: 29ms
    0000041714 [app.genie] INFO: Link idle achieved.  State iterations: 9490, avg latency: 0, max latency: 1, operation latency: 29
    0000041715 [app.widgets.numeric] INFO: State update.  Old: 398, New: 398, Latency: 30ms
    WARN: Loop falling behind.  Target: 50, Actual: 122
    The impact of 30ms per widget is plainly visible at 4 widgets, and puts back-pressure on my upstream message bus. Once I have all 6 configured (and the additional 4 low-frequency widgets) this will be a major problem!
    Last edited by warriorofwire; 21st October 2017, 05:05 PM.

  • #2
    Hello,

    Welcome to the forum.

    Were your tests based on real-time data, or on simulated data that changes every loop cycle? If all of the objects on screen are updated even when the last value was the same, it causes an unnecessary slowdown for the objects whose data has actually changed. In short, update only the objects whose values have changed.

    I hope this helps

    Best regards

    Paul






    • #3
      Hi Paul, thank you for looking into this!

      My system has 2 messages coming from the upstream device I'm monitoring. Each of those messages contains several (5 and 8) distinct measurements. Messages arrive in alternating order every 50ms.

      You can intuitively see that if I have 4 widgets that each block the serial bus for 30ms, even if they are optimally spread across the messages, I cannot keep up with the throughput requirement for the upstream message bus.

      My software does not request a redraw unless the monitored state is different from the last time it requested a redraw.

      The nature of the monitored system is such that when one thing is happening, several other things result in normal operation. 4 changing monitors from the same upstream message every 100ms is a very realistic expectation.

      My test is simulated, to measure the object draw throughput. The upstream message bus creates and passes data in exactly the same way as the simulation as far as the widgets are concerned. Every 50ms there is new state to be passed to the screen, invalidating 4-6 gauges.

      What is actually taking 30ms? Does that block for the screen to actually draw the update? Can we batch widget updates (30ms is less than 50 and could work)? What is the state of the screen's processor during those 30ms, can it accept more commands even if the client library doesn't technically allow it?



      • #4
        I bypassed the library's state management in favor of simply writing messages to the serial bus and ignoring the acks by directly invoking
        Code:
        WriteObject (uint16_t object, uint16_t index, uint16_t data)
        This works for 1 refresh of the display. Adding a 40-millisecond delay (greater than the observed p100 ACK latency) between serial messages works. Choosing a smaller value, like 30ms, results in a hang after a few moments. It seems the receive buffer on the Diablo16 is not getting drained after a message is received.

        Basically, 30ms latency is okay for a screen state update, but not for a blocking single-gauge update. What will it take to produce a batch or pipeline widget update?
        Here's what I've done to scope this problem as tightly as possible:

        Code:
            #define lowByte(w) ((uint8_t)((w) & 0xFF))
            #define highByte(w) ((uint8_t)((w) >> 8))
            #define GENIE_WRITE_OBJ         1
            #define GENIE_ACK               0x06
            #define GENIE_NAK               0x15
        
            // Blocking WriteObject API.
            // Uses the conversational ACK/NAK byte from Genie to indicate when serial
            //  is available again.
            // Use this when you can't do anything else until the bus is available; e.g.,
            //   when you have multiple gauges you need to update at the same time.
            void WriteObject (uint16_t object, uint16_t index, uint16_t data) {
              log.trace("Sending state to Genie.");
              uint8_t msb, lsb;      // lowByte/highByte return uint8_t
              uint8_t checksum;
              lsb = lowByte(data);
              msb = highByte(data);
              serial->write((uint8_t)GENIE_WRITE_OBJ);
              checksum  = GENIE_WRITE_OBJ;
              serial->write((uint8_t)object);
              checksum ^= (uint8_t)object;
              serial->write((uint8_t)index);
              checksum ^= (uint8_t)index;
              serial->write(msb);
              checksum ^= msb;
              serial->write(lsb);
              checksum ^= lsb;
              serial->write(checksum);
        
              log.trace("Waiting for Genie to ack the update.");
              int response;
              for (response = -1; response == -1; delay(1))
              {
                response = serial->read();
              }
              if      (response == GENIE_ACK) log.trace("ACK received.");
              else if (response == GENIE_NAK) log.error("Failed to write object, NAK received.");
              else log.error("Unknown response from Genie: %d", response);
            }
        And the corresponding logs look similar to the library's, just with far less code & state management to be concerned about:
        Code:
        0000034260 [app.widgets.numeric] TRACE: Selecting current state.
        0000034261 [app.widgets.numeric] TRACE: Current state selected
        0000034261 [app.widgets.numeric] TRACE: Writing new state: 1057
        0000034261 [app.widgets.numeric] TRACE: Sending state to Genie.
        0000034262 [app.widgets.numeric] TRACE: Waiting for Genie to ack the update.
        0000034294 [app.widgets.numeric] TRACE: ACK received.
        0000034294 [app.widgets.numeric] TRACE: Wrote new state
        0000034295 [app.widgets.numeric] INFO: State update.  Old: 1057, New: 1057, Latency: 34ms
        This uses the Particle logging system; the left column is milliseconds since application startup. You can infer timing from subsequent messages: in this case it took 32 milliseconds to get an ACK byte after spending about 1 millisecond computing the checksum and sending the state.
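        The blocking ACK wait above could also be restructured so those ~30ms overlap with other work. A minimal sketch of the idea (PendingWrite and its method names are made up, not part of the Genie library): write the frame, note that an ACK is outstanding, and poll for it from loop() instead of spinning in delay(1).

```cpp
// Hypothetical non-blocking ACK tracker (not part of the Genie library).
// After writing a 6-byte WRITE_OBJ frame, record that an ACK is outstanding
// and poll for it once per loop() pass instead of busy-waiting.
struct PendingWrite {
    bool ackOutstanding = false;

    // Call right after the frame bytes have been written to serial.
    void frameSent() { ackOutstanding = true; }

    // Call once per loop() pass with the next serial byte (-1 if none).
    // Returns true when the bus is free for the next widget update.
    bool poll(int byteFromSerial) {
        if (!ackOutstanding) return true;        // nothing in flight
        if (byteFromSerial == 0x06 ||            // GENIE_ACK: success
            byteFromSerial == 0x15) {            // GENIE_NAK: failed, but bus is free
            ackOutstanding = false;
            return true;
        }
        return false;                            // ACK still pending; do other work
    }
};
```

        loop() would feed poll() with serial->read() and only issue the next WriteObject once it returns true, so the display's redraw time no longer stalls the incoming message bus.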



        • #5
          You send a command to the display, the display actions it and then sends an ACK

          If you ignore the ACK, commands will build up on the display until its buffer overflows, and then the display will either crash or get completely lost.

          You need to wait for the ACK

          If you make the gauges smaller you will get quicker ACKs.

          If you update unimportant objects less frequently you will hopefully be able to get the time to update the important ones at the frequency you need
          Mark



          • #6
            I agree with everything you said up to the last point; the behavior of the display is as described.

            I have 4 important gauges that need to be updated frequently, i.e., 10 times per second each (30 updates per second is a noticeably better experience but 10 can suffice).
            The gauges are 200x171 pixels (does not seem excessive).
            The required update frequency is not possible when each gauge occupies the serial bus exclusively for 30ms: that allows only 33 draws per second in total and causes a buildup of messages on the incoming message bus.

            I've shaved off several milliseconds by getting the fastest microSD card on the market, but it does not get down to 25ms average duration per gauge update. 4 important gauges and several unimportant gauges are involved here, but I'm focusing on the 4 important ones first because well... they're important.

            I'm looking for a creative solution: Is there a way to bulk-update smart gauges using a Diablo16?



            • #7
              The time to update is all about the pixel size of each gauge, if you make them smaller they will update quicker, the baud rate (assuming 200kBaud) has almost nothing to do with it.

              Since the display uses SPI to communicate with the memory card, the speed rating of the card has almost nothing to do with it. Card manufacturers do not tend to publish the SPI speed rating of cards, the published ratings refer to SD mode.

              Ensure the files on the card are not fragmented and that the cluster size is 64k (use chkdsk *.*, it will show you both)
              Mark



              • #8
                Yeah, the baud measures 0-1ms impact at 115200, not a lot of room for improvement there, even at 200k.

                I wasn't sure whether the replacement card would help, but on average it did knock off 2ms, and its worst-case performance is 5ms better. That is, the display ACKs 2ms faster on average and 5ms faster worst-case. I'll keep using this card since it didn't hurt and gave a ~6% performance increase.

                Sounds like bulk-update and pipelining are dead-ends. I'll get to work on a QOS framework to budget time slices to gauges & try to dial the performance in as tight as possible via other means (like the other thread, trying RAW mode). Thanks for your time on this Mark! I'd love to +1 a feature (or new product) request for an even smarter display with gigabytes of RAM for insta-draws.



                • #9
                  I'm also running into this problem, updating multiple knobs blocks program execution and thus interferes with other tasks. I'm working on a MIDI DAW (digital audio workstation) controller with a Teensy 3.6 for the processing.
                  [Image: Zeus-DPC-Lexicon-front.jpg]

                  The rotary encoders are not directly controlling the virtual knobs on the display. The encoders send MIDI pitch-change data to the DAW, and the DAW echoes that MIDI data back to the Teensy; the Teensy then sends the commands to the display. In addition, the DAW sends the values as text. The text strings are all separate objects, and only objects that need to be updated are updated. Turning one encoder works excellently: the virtual knob perfectly follows the movement of the encoder (a high-resolution Bourns EM14 optical) even if you turn it very fast. But if two encoders are turned simultaneously, serious delays occur.
                  Is there a way to make these calls non-blocking by adapting the library to this specific use? Perhaps it is possible to implement some kind of command stack for the objects on the screen. The stack would contain one entry for each unique object; if the program pushes updates faster than the display can process them, the existing entry for that object would simply be overwritten. This means dropping states that couldn't be handled in time, but all objects would show the correct state in the end. Communication with the display would be driven by a timer and could be non-blocking.
                  Would this be possible?
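                  The "one entry per unique object, latest value wins" stack described above could be sketched roughly like this (illustrative host-side C++; UpdateQueue and its method names are hypothetical, not ViSi-Genie API):

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Coalescing update queue: at most one pending value per (object, index).
// Pushing a newer value overwrites the stale one, so the display only ever
// receives the latest state and never works through a backlog.
class UpdateQueue {
    std::map<std::pair<uint16_t, uint16_t>, uint16_t> pending;
public:
    void push(uint16_t object, uint16_t index, uint16_t value) {
        pending[{object, index}] = value;   // coalesce: latest value wins
    }
    bool empty() const { return pending.empty(); }

    // Pop one update; the caller sends it and waits for (or polls) the ACK
    // before popping the next, e.g. from a timer.
    std::pair<std::pair<uint16_t, uint16_t>, uint16_t> pop() {
        auto it = pending.begin();
        std::pair<std::pair<uint16_t, uint16_t>, uint16_t> entry = *it;
        pending.erase(it);
        return entry;
    }
};
```

                  A timer or ACK-complete event would pop one entry at a time and send it with WriteObject, so a burst of encoder movement collapses into a single pending update per knob.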

                  Kind regards,

                  Gerrit



                  • #10
                    which library are you using? The Genie one? The beta? Or the bypass posted a few posts back?



                    • #11
                      Originally posted by tonton81 View Post
                      which library are you using? genie? beta? or the bypass posted previous posts back?
                      The ViSi-Genie library; the files have a Jan 2016 timestamp. No changes to the library.

                      Kind regards,

                      Gerrit



                      • #12
                        There's a beta one on the repo, posted 5 months ago. Willing to test it?

                        https://github.com/4dsystems/ViSi-Ge...o-Library-BETA



                        • #13
                          Well Gerrit, to resolve the issue for my latency-sensitive display with high instrument surface area, I switched from ViSi-Genie to the serial environment and wrote a library, https://github.com/WarriorOfWire/photon_diablo16_serial , implementing drawing tools, a coordinate templating system, a global "dirty rectangle snippet" registry, and a widget/scene/data-binding framework.

                          I've gotten 8 gauges streaming updates at under 20ms latency per frame. Terrible cell phone pic: https://photos.app.goo.gl/YJurxWnw6mn5Ne7Z2

                          It's a major hassle, but I could not use ViSi-Genie due to its policy of naively redrawing the whole widget on every state update. Pixel updates on the Diablo16 are precious: you can only get 3fps if you update every pixel on every refresh, and each gauge in my system requires single-digit-millisecond latency!
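                          The "dirty rectangle" registry mentioned above boils down to tracking which screen regions a state change invalidated and merging overlaps, so each changed pixel is redrawn at most once per frame. A rough illustration (Rect and DirtyRegistry are illustrative names, not the linked library's actual code):

```cpp
#include <algorithm>
#include <vector>

// Illustrative dirty-rectangle bookkeeping: widgets mark the region their
// state change invalidated; overlapping regions are merged so each changed
// pixel is redrawn at most once per frame.
struct Rect {
    int x0, y0, x1, y1;  // inclusive corners
    bool overlaps(const Rect& o) const {
        return x0 <= o.x1 && o.x0 <= x1 && y0 <= o.y1 && o.y0 <= y1;
    }
    Rect merged(const Rect& o) const {
        return { std::min(x0, o.x0), std::min(y0, o.y0),
                 std::max(x1, o.x1), std::max(y1, o.y1) };
    }
};

struct DirtyRegistry {
    std::vector<Rect> rects;
    void markDirty(const Rect& r) {
        for (auto& q : rects) {
            if (q.overlaps(r)) { q = q.merged(r); return; }  // coalesce overlap
        }
        rects.push_back(r);
    }
};
```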

                          For your use, I imagine you'll eventually have to decide if you want to make your gauges smaller or if you want to do some heavy lifting. :-) I hope for your sake I'm wrong!



                          • #14
                            Originally posted by tonton81 View Post
                            theres a beta one on the repo posted 5 months ago, willing to test it?

                            https://github.com/4dsystems/ViSi-Ge...o-Library-BETA
                             I tried the beta, but there's something interfering with normal program execution, as the encoders no longer have the same range they used to. I have no clue what's causing it. The updates seem to be faster, but it's hard to judge because the encoders behave differently. I use the Encoder library by Paul Stoffregen. The encoders are all optical, so there's no bouncing or skipping.

                             Originally posted by warriorofwire View Post
                             Well Gerrit, to resolve the issue for my latency-sensitive display with high instrument surface area, I switched from visi genie to the serial environment, wrote a library https://github.com/WarriorOfWire/photon_diablo16_serial , implemented tools and coordinate templating systems and a global "dirty rectangle snippet" registry along with a widget/scene/data binding framework.

                            I've gotten 8 gauges streaming updates at under 20ms latency per frame. Terrible cell phone pic: https://photos.app.goo.gl/YJurxWnw6mn5Ne7Z2

                            It's a major hassle, but I could not use visi genie due to its policy of naively drawing a whole widget with every state update. Pixel updates on the Diablo16 are precious, you can only get 3fps if you update every pixel on every refresh. Each gauge in my system requires single-digit-milliseconds latency!

                            For your use, I imagine you'll eventually have to decide if you want to make your gauges smaller or if you want to do some heavy lifting. :-) I hope for your sake I'm wrong!
                            Good to hear you were able to solve your problem. Hope I don't have to go that far.

                            Kind regards,

                            Gerrit



                            • #15
                              Originally posted by Gerrit View Post
                              Is there a way to make these calls non blocking by adapting the library to this specific use? Perhaps it is possible to implement some kind of command stack for the objects on the screen. The stack would contain one entry for each unique object. If the program pushes updates faster than the display can process the entry in the stack for the object will be updated.
                               There have been thoughts about non-blocking writes, but it is fraught with difficulty, and on a low-end Arduino you will run out of memory very quickly.
                               You could also try a transmit queue: before adding something to the queue, check whether a 'message' for the same object already exists with a different value; if so, just update the value in place rather than adding a new 'message'. The 'old' library already does this for received messages and it works well.

                              Mark

