1985 Commits

Author SHA1 Message Date
Billy Laws
a940d6fd34 Update submodules 2022-11-02 17:46:07 +00:00
Billy Laws
026bb04386 Impl some more texture formats 2022-11-02 17:46:07 +00:00
Billy Laws
133f08ed14 Stash new register value before executing deferred draws/updates
Since the register writes technically happen after the draw, issues can occur if they are applied before it: e.g. Skyrim updates ctSelect and disables all RTs after a draw, but previously this would be applied before the draw and crash the driver.
2022-11-02 17:46:07 +00:00
Billy Laws
c50852e546 Implement the draw(...)BeginEnd Maxwell3D draw registers
Used by guest Vulkan games and nouveau.
2022-11-02 17:46:07 +00:00
Billy Laws
270ef3e0d2 Implement GPFIFO semaphore acquire operations 2022-11-02 17:46:07 +00:00
Billy Laws
2ce146e28f Don't crash on the Grp0SetSubDevMask TertOp
Used by Vulkan games to set the SLI mask, not applicable to the switch.
2022-11-02 17:46:07 +00:00
Billy Laws
1d83dadefb Drop size restriction bypass for frequently synced buffers
In cases where large buffers are updated every draw this could seriously increase memory usage beyond 3GB in the megabuffer.
2022-11-02 17:46:07 +00:00
Billy Laws
1088ed514c Introduce texture usage system to ensure RPs are split when necessary
Vulkan doesn't allow sampling a texture and using it as an RT in the same RP. By tracking the texture usage status and splitting RPs when this occurs, we can avoid such potential sync errors.
2022-11-02 17:46:07 +00:00
Billy Laws
2dd4698441 Adjust texture matching hacks 2022-11-02 17:46:07 +00:00
Billy Laws
4f5c9047ef Add some additional texture formats used by Vulkan games 2022-11-02 17:46:07 +00:00
Billy Laws
6a830dfac5 Use shader-compiler side {S,U}Scaled format emulation 2022-11-02 17:46:07 +00:00
Billy Laws
579fd04117 Fixup ReadTextureType shader compiler callback 2022-11-02 17:46:07 +00:00
Billy Laws
b04d18eba5 Add support for split mappings to I2M uploads
Used by Super Mario Sunshine and other Vulkan games.
2022-11-02 17:46:07 +00:00
Billy Laws
db5e208379 Clear images even when aspects mismatch 2022-11-02 17:46:07 +00:00
Billy Laws
3c8df327f1 Fixup subpass barriers and flags 2022-11-02 17:46:07 +00:00
Billy Laws
5ab80901c6 Drop some debug code 2022-11-02 17:46:07 +00:00
Billy Laws
4de89c8839 GPU NEW MARGEBAC 2022-11-02 17:46:07 +00:00
Billy Laws
7670c83405 Ensure textures are clean before paging them out 2022-11-02 17:46:07 +00:00
Billy Laws
1a2351386d Add u64 iova ctor 2022-11-02 17:46:07 +00:00
Billy Laws
93d43e0115 Fully fill in swizzle component mappings
Avoids the rest being default initialised to identity, which would break the intended effect of them.
2022-11-02 17:46:07 +00:00
Billy Laws
37ff0ab814 Add buffer manager support for accelerated copies
These will be sequenced on the GPU/CPU depending on what's optimal and avoid any serialisation
2022-11-02 17:46:07 +00:00
Billy Laws
cac287d9fd Implement accelerated uploads/copies through buffer manager
Previously, both I2M uploads and DMA copies would force GPU serialisation if they happened to hit a trap or were used to copy GPU dirty buffers. By using the buffer manager to implement them on the host GPU we can avoid such slowdowns entirely.
2022-11-02 17:46:07 +00:00
Billy Laws
c5ec484d9a Avoid redundantly passing executor in ctors when it's already in ChannelCtx 2022-11-02 17:46:07 +00:00
Billy Laws
463394ba72 Pass correct size for XFB buffers 2022-11-02 17:46:07 +00:00
Billy Laws
bd976676f4 Fix SNorm vertex formats 2022-11-02 17:46:07 +00:00
Billy Laws
b74098570f Zero-out unused XFB varyings before passing to hades 2022-11-02 17:46:07 +00:00
Billy Laws
22f3ba6b93 Mark XFB buffers as GPU dirty 2022-11-02 17:46:07 +00:00
Billy Laws
26aeeaecf5 Add constant buffer GPU write pipeline barrier 2022-11-02 17:46:07 +00:00
Billy Laws
0b5d9308c4 Be more careful about potentially-unneeded GPU->CPU syncs
These can be especially expensive so should be avoided as much as possible.
2022-11-02 17:46:07 +00:00
Billy Laws
e6530e2386 Delete graphics_context
F
2022-11-02 17:46:07 +00:00
Billy Laws
ac2e6c125b Switch to Roboto for Korean font 2022-11-02 17:46:07 +00:00
Billy Laws
b24a8465da Don't require depthClamp 2022-11-02 17:46:07 +00:00
Billy Laws
9055c98e09 Only enable debug/verbose logs in (rel)debug builds 2022-11-02 17:46:07 +00:00
Billy Laws
0ebdbcf0ff Don't lock stateMutex when updating buffer cycle 2022-11-02 17:46:07 +00:00
Billy Laws
dd360b8f75 Pass correct wait semaphore array size to queue submit 2022-11-02 17:46:07 +00:00
Billy Laws
c78a4b9699 Fixup buffer recreation to avoid deadlock when waiting on srcs 2022-11-02 17:46:07 +00:00
Billy Laws
d236bfe454 Enable depthClamp VK device feature 2022-11-02 17:46:07 +00:00
Billy Laws
95d849e1f6 Check FenceCycle signalled flag immediately before waiting
The lock release within the wait for submission means that another thread could end up signalling the cycle, and the VK wait would then still happen after the lock has been reacquired.
2022-11-02 17:46:07 +00:00
Billy Laws
1a23b929a7 Avoid chaining cycles in buffer recreation
This had a chance of creating circular chains, which obviously caused issues; just do a wait instead for now.
2022-11-02 17:46:07 +00:00
Billy Laws
a15db9cb06 Update hades submodule 2022-11-02 17:46:07 +00:00
Billy Laws
cfc55e60b0 Add robin map submodule 2022-11-02 17:46:07 +00:00
Billy Laws
6c0f084aae Introduce hack to ignore frequently read-back textures
Readback can be especially slow on mobile due to the varying load pattern it creates, which often prevents the CPU/GPU from clocking up. Since some games perform texture readback but don't actually use it for anything significant, implement a hack to skip it and significantly improve performance in such cases.
2022-11-02 17:46:07 +00:00
Billy Laws
e45e7546c8 Redesign buffer megabuffering
Due to the frequency at which it is called, megabuffering performance is critical to the performance of the entire emulator, especially in high-drawcall-count scenarios. After the view redesign, megabuffering on a per-view level was no longer possible nor desirable, and thus megabuffering was modified to just copy for every usage of a view. This worked great at the time since there were other bottlenecks; however, gpu-new has since removed almost all of them and megabuffering is now a major sore point. Fix this by megabuffering small chunks and storing them in a page-table-like structure within the buffer. These chunks can be referenced by multiple views and will be smartly invalidated whenever the sequence number or execution number changes to avoid any sequencing issues. In addition to this, to help the case where almost the whole buffer is read every single frame across a set of multiple views, an optimisation to skip the chunked tracking and use one large single megabuffer allocation and one single memcpy has been introduced. This reduces the overall amount of time spent in memcpy since large memcpys are quicker.
2022-11-02 17:46:07 +00:00
Billy Laws
7ea9aa52f5 Speed up reported guest GPU time
Avoids triggering DRS in games in cases where it wouldn't actually benefit anything due to being CPU bottlenecked.
2022-11-02 17:46:07 +00:00
Billy Laws
31c2fb7d7a Fixup IDirectory read 2022-11-02 17:46:07 +00:00
Billy Laws
7491178a9e Pass base array layer to texture views 2022-11-02 17:46:07 +00:00
Billy Laws
ff57d2fbbf Enforce stronger format and weaker dimension texture compat checks
Rather than using just bpb for format compat, additionally check that the exact component bit layout matches, since many games end up reusing RTs for unrelated textures. The texture size requirements have also been weakened to only check the resulting layer size as opposed to width/height. This is somewhat hacky, but it gets around the problem of blocklinear alignment.
2022-11-02 17:46:07 +00:00
Billy Laws
14af383238 Only allow submitting swapchainImageCount images for host present at a time
Prevents situations where nothing would otherwise be waiting on the GPU and, since presentation no longer blocks, too many images would be submitted for presentation.
2022-11-02 17:46:07 +00:00
Billy Laws
bcd96ac77d Fixup A8R8G8B8 TIC format mapping
8-bit formats are inverted in TICs compared to Vulkan
2022-11-02 17:46:07 +00:00
Billy Laws
90466b8830 Implement depth clamp rasterisation state
Used in SMO for shadows.
2022-11-02 17:46:07 +00:00