Copying Large Matrices in Memory
ThresholdMy latest release Archangel Alpha Version 0.1 makes use of my one-and-only VB lighting engine (at least, the one-and-only dynamic 2D lighting I've ever seen in any VB game, or any 2D game for that matter). However, I've come accross some problems. The lighting engine makes use of a MASSIVE 3 demensional array (light map) which stores color values for each vertex of each tile. In order to minimize calculation during run-time, the game editor allows you to set static lights whose lighting is calculated once during run-time and then stored in a memory array. Provided static lights are being used, the actual light map is reset back to this memory array each time the light map is cleared. Just as in clearing the back buffer with any rendering, the light map must be cleared and recalculated (at least, a portion of it) every time a dynamic light (non-static) moves, changes range, changes color, etc. This poses a big problem when it comes to large maps (say 100x100 with 32x32 tiles): a) Calculating the the entire map's static lights can be very slow (with 58 lights on a 100x100 map it can take as much as 10 seconds to calculate all static lights; I do have optimizations in progress for this however) b) With a map of that size, the light map is approximately 1.2 MB in memory. When clearing the light map (by reassigning it the memory light map) I have to do the following (NOTE: Light map arrays are contained in UDTs for ease of copying): LightMap = LightMem_Map With a map of 100x100 tiles (and the array containing 4 3rd demensional elements for each vertex...and yes I don't actually need 4 for each tile since tiles share boundaries; I'll be fixing that soon) that small line can take up to 20 ms. That may not sound like a long time, but when it's being done every frame (provided a dynamic light is changing) it really hurts the frame rate (20 ms is 1/50 if a second). Remember, with a map of 100x100 tiles, the array is 100x100x4. And given that the lighting data (8 Long variables contained in a UDT) takes up 32 (8x4) bytes that's 100x100x4x32 = 1280000 bytes = 1.221 MB. I need to find a solution. The things I've thought of so far are: a) Find a faster way to copy large arrays (I don't know how to use the API CopyMemory in this context) b) Find a fast way to only clear (copy) the portion of the array which is visible on the screen without (or maybe with) a loop going through each element representing a value which will effect the screen. c) Find a way to store the light map differently Thanks. All ideas and questions are welcome.
Eric ColemanTips for speed: 1. Don't resize your array. To gain speed you have to define some limitations in your program. Special cases can always be optimized better than general solutions. 2. Limit the range of your lights. If they're limited enough then you can use "dirty rectangle" algorithms. If a light has an actual size or area or volume, then you can limit your calculations to that area only. 3. Store an extra number in the UDT, use a LONG for memory alignment purposes, and keep track of the number of lights that influence a vertex. With this information you can easily "subtract" lights, and this would work well with the dirty rectangle methods. Since the vertex contains the average of different lights, you can calculate the influence of a light that you want to subtract (that was previously added) and easily subtract it. If you have an average that was precalculated as x = (a+b+c)/3 , then to remove any of the components from X you simply use x = (x - a / 3) * (3/2), x = (x-b/2) * (2/1). If a light moves, then you can easily subjract the light from it's previous position, and then move the light, then add it back to the light map.
ThresholdI don't see how I could clear my array any other way that erasing it and resizing it, or assigning it an empty variable (defined with the UDT) which would have to resized anyway. It is only resized for clearing when there is no memory light map containing static light information. Otherwise, it copies the memory light map to the actual light map in order to "clear" it or set it back to the default lighting. If I'm visualizing the "dirty rectangle" idea correctly, I believe I'm already using it. Every time the light is updated, its range rectangle is updated as well. This is a RECT determined by adding and subtracting the range value from the X and Y of the light's location. When it comes to static lights, it is only within this region that the lighting for that light is calculated. For dynamic lights, I use IntersectRECT and only calculate lighting for tiles that are common to the light's range and the screen. For tip 3, I would indeed have to limit the engine immensly. Considering lights can have different colors, locations, and ranges, merely subtracting the influence of a light would be difficult and impossible in most situations. It would only work if the light in consideration has moved far enough away from a specific vertex that it had no effect (which would rarely happen within a 2 frame time period), or the light was turned off since the last update. Otherwise, in order to calculate the vertex's lighting, I have to determine the distance between it and the light in consideration, apply this to it's fading value for color components, and then TEST to see if it will effect the vertex. For example, if a given white light is shining from origin (x,y) and another red light shining from (x+20, y+20), the red light will NOT be noticed within the majority of the white light's range. Why? Because the white light already HAS red. Only if a given vertex's current red value is LESS than what its red value would be with the application of the red lighting would it actually become more red. That is why I need a light map, so that multiple lights will blend. I set all vertex's colors to black (0) or the ambient color, then apply the first light. Only the components (R, G, and B) of the second light that are greater than each vertex's current component color values will be applied. This is very confusing to describe, sorry. Perhaps pseude-code would help: [code] For i = FirstVertex To LastVertex AppliedColor = Distance(Light.X, Light.Y, Vertex(i).X, Vertex(i).Y) * Light.Fade If AppliedColor > Vertex(i).Red Then Vertex(i).Red = AppliedColor If AppliedColor > Vertex(i).Green Then Vertex(i).Green = AppliedColor If AppliedColor > Vertex(i).Blue Then Vertex(i).Blue = AppliedColor Next i [/code] It's actually far more complicated than this (AppliedColor should have some relation to each color component in the light's original color), but it get's the idea accross. I hope that by this description, the process of lighting becomes clearer. Maybe I am misunderstanding the concept of "dirty rectangles" and/or how I could subtact the influence of a light. Thanks for your help!
Eric ColemanI think I don't fully understand your concept of a "lightmap". My original thought was that you had an array of color information like so (in pseudo code) [code] UDT r as single g as single b as single End udt LightMap(MapsizeX, MapsizeY) as UDT Map(MapsizeX, MapsizeY) as Maptype For Each Light Add Data to LightMap Apply Lightmap Information to Vertices in Map Render Map [/code]
ThresholdRight. That's basically the idea. Here's the code definition, then I'll explain it. [code] ' Lighting Public Type LIGHT_PROPS X As Single ' Location on X axis Y As Single ' Location on Y axis ColR As Long ' Red component in light color ColG As Long ' Green component in light color ColB As Long ' Blue component in light color SpecR As Long ' Red specular component SpecG As Long ' Green specular component SpecB As Long ' Blue specular component Range As Single ' Maximum range of the light (in pixels) Penetrate As Single ' (Unsupported) Percent of light reduction when going _ through walls (100 means it does not shine _ through walls; this effect can also be _ affected by tile properties) Enabled As Boolean ' Is the light on? Static As Boolean ' Is the light static? ' x members store data for optimization (Call UpdateLight every time _ something changes for that light) xFade As Single ' Fade value (measured in color units (0 to 255) _ per pixel) xHomeX As Long ' X location (in tiles) xHomeY As Long ' Y location (in tiles) xBigCol As Long ' Value of the brightest color component in the _ light's color xBigSpec As Long ' Value of the brightest specular component xRangeRECT As RECT ' Range rectangle (in tiles) End Type Public Type LIGHT_COL_NODE ColR As Long ' Red component in color ColG As Long ' Green component in color ColB As Long ' Blue component in color Color As Long ' Color of light node SpecR As Long ' Specular red component SpecG As Long ' Specular green component SpecB As Long ' Specular blue component Specular As Long ' Specular of light node End Type Public Type LIGHT_COLOR_MAP tMapWidth As Long ' Map width in tiles tMapHeight As Long ' Map height in tiles lPrevUpdate As Long ' Tick Count of previous update bRequireSpec As Boolean ' Does the map require specular _ to be turned on to render accurately? Nodes() As LIGHT_COL_NODE End Type Public LightMap As LIGHT_COLOR_MAP ' Color information for each lighting node Private LightMem_Retained As Boolean ' Has the memory light map been retained? Private LightMem_Map As LIGHT_COLOR_MAP ' A memory of the basic light color map [/code] The first UDT is a description of a light. I merely use a dynamic array to store lights (I'm no big fan of classes). The second is a description of a light color node (vertex). It stores the R, G, and B components of both the color and the specular of the vertex and stores their actual color value (to keep from doing it during rendering and wasting time). Third is the description of a light map. The most noticable element being the dynamic Nodes array. This is a 3 dimensional array (i, j, k) in which i and j specify x and y "coordinates" within the array which correspond to a location map (not shown) merely giving X and Y pixel coordinates for each vertex; k = 0 to 3, the vertex (4 vertices per square). K could be optimized, as I have stated, because tiles share boundaries and therefore it is useless to calculate lighting for the top right-most vertex of tile (X, Y) AND the top left-most vertex of the tile (X + 1, Y). Finally are the variables used in the lighting engine. LightMap is the lightmap which is used for rendering, and LightMem_Map is the exact same size but stores ONLY the lighting of the static lights. For clearing LightMap when static lights are present, LightMap is set equal to LightMem_Map. To render the lighting for tiles, I simply set their vertices colors to the corresponding light map color. For example: [code] Dim i As Long Dim j As Long Dim Verts(0 To 3) As TLVERTEX For i = LeftMost_ScreenTile To RightMost_ScreenTile Step 1 For j = TopMost_ScreenTile To BottomMost_ScreenTile Step 1 InitLitGeom(Verts, i * TILE_SIZE, j * TILE_SIZE, width, height, srcwidth, srcheight, ......., Color0:=LightMap.Nodes(i,j,0).Color, Color1:=LightMap.Nodes(i,j,1).Color, Color2:=LightMap.Nodes(i,j,2).Color, Color3:=LightMap.Nodes(i,j,3).Color, ... specular colors if needed) RenderVerts(Verts) Next j Next i [/code] Anyone familiar with tile-based gaming shouldn't have too much trouble following this code. It loops through all tiles visible on the screen and initializes their vertices and colorizes them according to the light map. Pretty simple. Perhaps I should just create a sample project and put it up for people to examine. I hope this clears things up. Lighting isn't difficult mathematically, just logically. [:)]
Eric ColemanIn reference to that last code section, are you updating your vertices every frame?
ThresholdAre you refering to the vertices used to render tiles? Yes, though I probably shouldn't. Using a similar "map" for that also would probably be faster. In fact, that gives me an idea. I could create a map of all tile vertices and then the lighting would merely use that instead of having its own. That would cause a few difficulties here and there, but nothing I can't handle. I'll see what I can do with that. Nevertheless, I doubt this new idea is going to speed up lighting as drastically as needed, but it may. Thanks! Any further feedback is appreciated.
Eric ColemanWell, I'm sure most of the maps lights are static from frame to frame, so there really is no need to update the entire map. When a light changes, make a note of it's size and the square that it influences. Those square are your "dirty rectangles." When updating your vertices, only update in the designated areas. Also, I'm not sure why you need to erase the lightmap or resize it. Why and when do you do that?
ThresholdBy "erase" and "resize," I basically mean the same thing in this context. Consider... [code] Dim byteArr() As Byte ' byteArr is filled, used, etc.... Erase byteArr ' Clear the array from memory [/code] vs. [code] Dim byteArr() As Byte ' byteArr is filled, used, etc.... ' Note there is no Preserve keyword (e.g. Redim Preserve byteArr...) Redim byteArr(0 To MaxValue) [/code] In both cases, byteArr's information is cleared. The only difference is that in the first, it is completely destroyed, and the second, it is resized without preservation of its data. When no static lights are present, or the memory light map has not yet been retained, I clear the light maps by redimming them to their original size. This doesn't actually change their size, but it does clear their data. The problem here is not that I can't calculate the lighting, it's that the memory light map is so big, that copying it takes too long. Given, it DOES copy the entire memory light map to the actual light map whenever a dynamic light changes. This is really unnecessary, and it goes slow on large maps. Is it possible to use an API or some equivalent to copy only a certain portion of a large array? If so, I could make the light map the size of the viewport (target/rendering window) and make the memory light map the size of the game map and calculate static lights as they appear on the screen and only copy the portion of the memory light map that is on the screen to the actual light map which would be drastically smaller in size. Perhaps a visual aide is in order... [code] 'Memory light map is same size as "game" map (grass, walls, floors, etc.) O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O 'O = tile, each tile has 4 vertices (only 1 is unique on most tiles) 'This 10x10 memory light map would be calculated as the viewport (below) moves in it 'ViewPort (say the screen can only show 3x3 tiles), X = visible tile on the screen O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O X X X O O O O O O O X X X O O O O O O O X X X O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O 'I could make the actual light map (the light map used for rendering) the same size as the screen which would be constant (though the game map is also, but the screen will usually be much smaller than the entire game map). 'As the viewport moves, the memory light map is updated (1 time only, if the screen moves back, the memory light map would already be calculated for that area) [/code] The only dilemma I face with this proposal is that I need a quick, efficient way of copying a portion of the memory light map array to the viewport light map array. Looping through would work, but I would imagine it'd be too slow. Thanks, I think we're starting to get somewhere.
Eric ColemanYou didn't answer my questions. I didn't ask how do you redim or erase, I asked why and when. When are you erasing and/or redimming the light map array? Why are you erasing and/or redimming the light map array?
ThresholdThe answer to why, as stated in my previous post, is to clear the array of data and set all values back to 0. The answer to when is everytime the lightmap needs to be cleared to recalculate the visible portion of light map. This would be any time that a dynamic light that is visible on the screen changes. Actually, if there are any static lights used, the light map is only cleared once: at the very beginning of the level/game/etc. This is because, with the presence of static lights, the light map is never "cleared" completely again; it is reset to the default lighting which is stored in the memory light map (all the static lights). This method may change if I can find a fast way to copy a portion of an array to another array. Any ideas?
Eric ColemanOk, I think I understand things a bit better now. First, there is no need to resize the array to clear it. You can use the methods I described earlier to easily "remove" lights from the light map instead of having to recalculate everything. Consider these three stages of a hypothetical light map. The white light is static, and the blue light is dymanic. The blue light can be subtracted from the light map, then it's properties can be adjusted, such as color, size, position, and then it can be added back to the map. [img] coleman/2004126133832_untitled.gif[/img] Since you know the size of the light, you would limit your changes to the portion of the lightmap that's actually effected by the light.
ThresholdI now see what you were talking about when introducing the concept of subtracting light influence. I'll bet that just might work fine. Now I don't even need to clear the light map. I think all the concepts here could be combined to make the lighting engine lightning fast. Thanks! [:(] This is going to force me almost recode my engine. It is for a good cause.
Eric ColemanChange your light UDT to something like [code] R As Single G as Single B As Single Count As Long [/code] When adding lights to the light map, don't average the values. [code] With LM(0,1) .R = .R + R_component .G = .G + G_Component ... .Count = .Count + 1 End With [/code] Depending on how you store your light information, use LONGs for the RGB components if you use 0 to 255, or Singles for 0 to 1. I prefer using Singles, but that's just me. Using this method you should get values greater than the max value for colors. For example, if a Blue and White light effect the same pixel, then the light map would have R = 1, G = 1, B =2, Count = 2. When updating your vertex data, only update the vertices that you've changed since last frame. Alternatively, you could opt to only update vertices that are currently visible on the screen. Either method would work. The idea is to only update when necessary and only where needed. Back to the light's out of range color. When updating your vertices you simply clamp the value to what's visible, 0 to 1 or 0 to 255 depending on your data type. Don't clamp the light map, just clamp what you assign to the vertices. The reason for keeping the high values is for subtraction purposes. Given the color before, (1,1,2), subtracting white (1,1,1) would leave (0,0,1), which is blue. Inversely, subtracting blue(0,0,1) from the color(1,1,2) would leave white(1,1,1). Simple, right? This works for a very high number of lights. Using a LONG data type to store the net light value of each color component you can get around 8 million lights affecting a single vertex.
ThresholdOnly 8 million? I hate limitation! So the key is not to limit the stored color values to 0 to 255 (which is what I prefer, I must admit, just because I'm more used to it). But when rendering (creating the color with D3DColorRGBA), how would I achieve the actual color to be used on the vertex. I imagine I could just use something like [code] ColorToUse = D3DColorRGBA(IIf(R > 255, 255, R), IIf(G > 255, 255, G), IIf(B > 255, 255, B), 255) [/code] Am I correct? Or I may have yet some misconceptions regarding this method.
Eric ColemanYes, you would clamp the colors. However, IIf is a bit slower than an If .. Then .. Else statment, and since you're looking for speed improvments, so these 4 lines if r > 255 then r2 = 255 else r2 = r if g > 255 then g2 = 255 else g2 = g if b > 255 then b2 = 255 else b2 = b colortouse = d3dcolorrgba(r2,g2,b2,255) are about twice as fast as the single line with IIf statements.
ThresholdThanks! I will post my results and hope to soon make this lighting engine available to all.